Thesis Title: Reducing Rule Generation Complexity in the Prism Algorithm
Author: Mohammed Yaseen Hammadi, Supervisor: Dr. Rashid Al-Zubaidy, Year: 2014
Faculty: Information Technology, Department: Computer Science

Abstract: Data mining is a field of computer science concerned with finding relations and patterns within data (training data). Once these relations and patterns are detected, rules can be created and later used to filter and process future data (test data). Prism is a simple covering algorithm based on the separate-and-conquer approach; it creates rules by measuring the strength of the relation between attribute values and the target class. The primary phase in Prism is rule generation, in which the strength of the relation between every attribute value and the target class is computed at each step, and the training data set is then classified according to the results. A central feature of the rule generation step is that whenever Prism detects two equal strength values, it selects only one and leaves the other for the next iteration of computation and filtering. However, observations and experiments on several data sets showed that the other equal value is always chosen in the next iteration, so this repeated computation is unnecessary overhead. In this thesis, we aim to remove this redundancy from the rule generation phase by presenting the enhanced Prism (E-Prism) algorithm. Another limitation of Prism is that it handles only categorical attributes. Some discretization methods were previously used with Prism, but these techniques have high computational complexity compared to other discretization methods. This thesis aims to overcome this problem by using low-complexity discretization methods as a pre-processing phase, allowing Prism to deal with continuous attributes.
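As an illustration of the strength computation described in this abstract, the following minimal sketch uses the standard Prism measure, the conditional probability of the target class given an attribute value, and reports every attribute-value pair that shares the maximum strength. The function names, the `class` key, and the toy data are assumptions made for illustration; the abstract does not specify how E-Prism handles the tied pairs, so the sketch only detects them.

```python
from collections import Counter
from fractions import Fraction


def pair_strengths(rows, target_class, class_key="class"):
    """Return {(attribute, value): P(target_class | attribute == value)}.

    Fractions are used so that equal strengths compare exactly.
    """
    covered = Counter()   # instances matching each attribute-value pair
    positive = Counter()  # of those, instances belonging to target_class
    for row in rows:
        for attr, value in row.items():
            if attr == class_key:
                continue
            covered[(attr, value)] += 1
            if row[class_key] == target_class:
                positive[(attr, value)] += 1
    return {pair: Fraction(positive[pair], covered[pair]) for pair in covered}


def strongest_pairs(strengths):
    """Return all pairs that share the maximum strength (ties included)."""
    best = max(strengths.values())
    return sorted(pair for pair, s in strengths.items() if s == best)


# Hypothetical toy training set, not taken from the thesis.
rows = [
    {"outlook": "sunny", "windy": "no", "class": "play"},
    {"outlook": "sunny", "windy": "no", "class": "play"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
]

strengths = pair_strengths(rows, "play")
ties = strongest_pairs(strengths)
```

Here `outlook=sunny` and `windy=no` both reach the maximum strength, which is the tie situation the abstract refers to. Plain Prism would pick one of them and recompute strengths on the covered subset in the next iteration.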

Keywords: Data Mining, Prism, E-Prism Algorithm

Thesis Title: Denotational Semantics for Cloud# Language
Author: Yehia Moustafa Abd Al-Rahman, Supervisor: Dr. Mourad Maouche, Year: 2013
Faculty: Information Technology, Department: Computer Science

Abstract: CloudMDE is considered one of the most significant research areas in software development today, and it has attracted increasing attention from the research community. CloudMDE aims at identifying opportunities for Cloud Computing to benefit from model-driven engineering techniques and vice versa. The Cloud# language has been proposed as a way of using model-driven engineering techniques to support Cloud Computing. It is a domain-specific modeling language for modeling the infrastructure of the cloud. Cloud# is an imperative language with a textual concrete syntax. It manipulates cloud infrastructure components as first-class citizens. Furthermore, it supports concurrency and event-driven actions. Until now, a BNF abstract syntax, a concrete syntax, and an informal semantics description have been available for Cloud#. However, the language lacks a formal semantics. Object-Z has been used as a meta-language for defining the formal semantics of Cloud# in a single unified framework. That is, the abstract syntax, static semantics, and dynamic semantics of a single language construct are specified in one Object-Z class. Not only does this help the readability of the semantics, but if the language is enhanced or evolves, the required modifications can be made with minimal disruption to the existing semantics. It is also possible to reuse parts of the semantics definition of one language to define another. In addition, consistency checking for Cloud# has been carried out using an Object-Z type-checker tool. A sample Cloud# model has been converted to Object-Z specifications and then submitted, along with the existing formal denotational semantics, to the type-checker. No typing errors have been found, which indicates the consistency of the Cloud# language.

Keywords: CloudMDE, Cloud Computing, Cloud# Language, Object-Z

Thesis Title: Discovering User Attitudes of Business in Twitter Language Feed
Author: Ibrahim Mohammed Saleh AbdulNabi, Supervisor: Dr. Samir Tartir, Year: 2013
Faculty: Information Technology, Department: Computer Science

Abstract: Research on Arabic sentiment analysis is very limited and still in its early stages compared to other languages such as English, whether at the business level or the social level. Twitter can now be considered an information network rather than a social network, because it is a platform for shared experiences and a very human network. Twitter, a micro-blogging service, has emerged as a new medium in the spotlight through recent events. Sentiment analysis aims to determine the attitude of a speaker or writer with respect to some topic, or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation. The problems addressed in this thesis are discovering user attitudes and business insights from Arabic Twitter feeds and enhancing Arabic natural language processing using novel Semantic Web techniques. The motivation for our work is that social media have swept into every industry and business function and are now an important factor of production. The primary goal of our work is to help companies make better business decisions by enabling Semantic Web and data mining techniques. We propose a new model to discover Arabic business insights from tweets that may affect the decision-making process. This model classifies Arabic tweets as positive or negative and tries to enhance the classification precision by creating an Arabic Sentiment Ontology (ASO) for the first time and by building our ASO aggregate weights algorithm. Our solution is intended as a starting point for any research that focuses on Arabic sentiment analysis and on extracting information from Arabic social media.

Keywords: Arabic sentiment analysis, Twitter, Natural Language Processing, Arabic Sentiment Ontology, Social Media

Thesis Title: New Rule Based Classification Algorithm for Automobile Insurance Fraud Detection
Author: Ahmad Okleh Ali AL-Ali, Supervisor: Dr. Fadi Fayez Thabtah, Year: 2013
Faculty: Information Technology, Department: Computer Science

Abstract: Automobile Insurance Fraud (AIF) is a significant and costly problem for both policyholders and insurance companies. Fraudulent activities can reduce the profits of automobile insurance companies. Data mining, especially rule-based classification algorithms, can help detect fraudulent activities. In these algorithms, the output is represented as simple, interpretable "If-Then" knowledge stored in a knowledge base. However, rule-based classification algorithms such as PRISM generate a large number of rules, and because maintaining and understanding a classifier depends on its size, large classifiers are hard for the typical end user to work with. Moreover, some correlation rules in PRISM that are nearly perfect cannot be extracted. In a competitive environment, these missing rules are considered very significant in the prediction phase. On the other hand, rule induction algorithms such as Repeated Incremental Pruning to Produce Error Reduction (RIPPER) produce small classifiers that often have low accuracy; such classifiers are not suitable for the AIF classification problem because some knowledge goes undetected. This thesis investigates the applicability of a strength-threshold-based covering method to the problem of detecting the accident type, in order to balance the number of generated rules without affecting the classification rate. The new algorithm, named Strength Threshold Based Coverage Prism (STBCP), achieves this balance and, as a result, produces classifiers of average size.

Keywords: Automobile Insurance Fraud (AIF), PRISM, STBCP

Thesis Title: Development of Multi-Label Classification Algorithm Based on Correlations among Labels
Author: Raed Hasan Saleh Diab, Supervisor: Dr. Fadi Fayez Thabtah, Year: 2013
Faculty: Information Technology, Department: Computer Science

Abstract: Multi-label classification is concerned with learning from a set of instances that are each associated with a set of labels; that is, an instance can be associated with multiple labels at the same time. This task occurs frequently in application areas such as text categorization, multimedia classification, bioinformatics, protein function classification, and semantic scene classification. Current multi-label classification methods can be divided into two groups. The first group, problem transformation methods, transforms the multi-label classification problem into a single-label classification problem and then applies any single-label classifier to solve it. The second group, algorithm adaptation methods, adapts an existing single-label classification algorithm to handle multi-label data. In this thesis, we propose a multi-label classification algorithm based on correlations among labels that uses both problem transformation and algorithm adaptation. The algorithm begins by transforming the multi-label dataset into a single-label dataset using the least frequent label criterion, and then applies the PART algorithm to the transformed dataset. The algorithm also tries to benefit from positive correlations among labels using the predictive Apriori algorithm. The output of the algorithm is a set of multi-label rules. The algorithm has been evaluated using two multi-label datasets ("Emotions", "Yeast") and three evaluation measures (Accuracy, Hamming Loss, Harmonic Mean). Further, we show by experiments that this algorithm has fair accuracy compared with other related algorithms.
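The least frequent label transformation and the Hamming Loss measure mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names and toy label sets are hypothetical, ties between equally rare labels are broken alphabetically (the abstract does not say how ties are resolved), and Hamming Loss follows its usual definition as the fraction of label slots on which the predicted and true label sets disagree.

```python
from collections import Counter


def least_frequent_label_transform(label_sets):
    """Map each instance's label set to its least frequent label.

    Frequencies are counted over the whole dataset. Ties are broken
    alphabetically, which is an assumption of this sketch.
    """
    freq = Counter(label for labels in label_sets for label in labels)
    return [min(labels, key=lambda lab: (freq[lab], lab)) for labels in label_sets]


def hamming_loss(true_sets, predicted_sets, all_labels):
    """Average fraction of labels on which prediction and truth disagree."""
    mismatches = sum(len(t ^ p) for t, p in zip(true_sets, predicted_sets))
    return mismatches / (len(true_sets) * len(all_labels))


# Hypothetical toy label sets, not taken from the Emotions or Yeast datasets.
label_sets = [{"happy", "calm"}, {"happy"}, {"sad", "calm"}, {"happy", "sad"}]
all_labels = {"happy", "calm", "sad"}

single_labels = least_frequent_label_transform(label_sets)

# Score the single-label view against the original multi-label truth.
loss = hamming_loss(label_sets, [{lab} for lab in single_labels], all_labels)
```

Any single-label learner (PART in the thesis) could then be trained on `single_labels`; the loss value here only shows how much label information the transformation alone discards on the toy data.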

Keywords: Multi Label Classification, Accuracy, Hamming Loss, Harmonic Mean