Text Processing Services

Urdu Part-of-Speech (POS) Tagging Service

The CLE Urdu Part-of-Speech (POS) tagging service assigns a POS tag, such as noun, verb, adjective or adverb, to each word/token of the input text.

For the complete CLE POS tagset, click here.


Urdu Content Profanity Estimation Service

The CLE Urdu Profanity Estimation Service computes a profanity score for the input text based on a lexicon of Urdu words deemed inappropriate in the cultural and social context. The score is a number between 0 and 1, rounded to two decimal places. The closer the score is to 1, the more profane the content.
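
As a rough illustration of a lexicon-based score of this kind, the sketch below assumes the score is simply the fraction of whitespace-separated tokens found in the lexicon. That formula, the function name `profanity_score` and the placeholder lexicon entries are all assumptions made for illustration; the actual CLE lexicon and scoring method may differ.

```python
def profanity_score(text: str, lexicon: set[str]) -> float:
    """Return the fraction of tokens found in `lexicon`, rounded to two decimals.

    Assumption: whitespace tokenization and a plain ratio. The real service
    may weight words or treat context differently.
    """
    tokens = text.split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return round(hits / len(tokens), 2)


# Placeholder entries stand in for real lexicon words (hypothetical data).
placeholder_lexicon = {"BADWORD1", "BADWORD2"}
print(profanity_score("yeh BADWORD1 hai", placeholder_lexicon))  # 1 of 3 tokens -> 0.33
```

A ratio of this form naturally stays between 0 and 1, which matches the range described above.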


Language Identification Service

The CLE Urdu Language Identification Service identifies the presence of Urdu in multilingual content containing Arabic, English and Urdu text, and computes the proportion of Urdu in the input. It returns a score between 0 and 1, rounded to two decimal places. The closer the score is to 1, the larger the share of Urdu words in the input text.
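
One simple way to approximate such a proportion is sketched below. It assumes a token counts as Urdu when it contains at least one letter that Urdu orthography uses but the standard Arabic alphabet does not. This heuristic, the name `urdu_proportion` and the chosen letter set are illustrative assumptions, not the method used by the CLE service, and Urdu words made up only of letters shared with Arabic would be missed.

```python
# Letters used in Urdu orthography that are not part of the standard Arabic
# alphabet (assumption: their presence marks a token as Urdu).
# Written as escapes: tteh, peh, tcheh, ddal, rreh, jeh, keheh, gaf,
# noon ghunna, heh doachashmee, heh goal, farsi yeh, yeh barree.
URDU_ONLY_LETTERS = set(
    "\u0679\u067E\u0686\u0688\u0691\u0698\u06A9\u06AF"
    "\u06BA\u06BE\u06C1\u06CC\u06D2"
)


def urdu_proportion(text: str) -> float:
    """Return the share of tokens classified as Urdu, rounded to two decimals."""
    tokens = text.split()
    if not tokens:
        return 0.0
    urdu_tokens = sum(
        1 for token in tokens if any(ch in URDU_ONLY_LETTERS for ch in token)
    )
    return round(urdu_tokens / len(tokens), 2)


# One Urdu token, one Arabic token and two English tokens.
sample = "\u06C1\u06D2 \u0641\u064A hello world"
print(urdu_proportion(sample))  # 1 of 4 tokens -> 0.25
```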


Urdu Text Summarization Service

The CLE Urdu text summarization service takes Urdu text as input and produces an abridged version as output using an extractive summarization technique. The length of the summary depends on the compression ratio specified by the user.
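
To show how a compression ratio can drive an extractive summary, here is a minimal sketch that scores sentences by average word frequency and keeps the top-scoring ones in their original order. The scoring scheme, the function name `summarize`, and the reading of the compression ratio as the fraction of sentences to keep are assumptions for illustration; the technique used by the CLE service is not specified here.

```python
import math
import re
from collections import Counter

# Sentence enders: Urdu full stop (U+06D4), Arabic question mark (U+061F),
# plus ".", "?" and "!".
SENTENCE_END = re.compile("[\u06D4\u061F.?!]+")


def summarize(text: str, compression_ratio: float) -> list[str]:
    """Return the highest-scoring sentences, kept in their original order.

    Assumption: `compression_ratio` is the fraction of sentences to keep.
    """
    sentences = [s.strip() for s in SENTENCE_END.split(text) if s.strip()]
    if not sentences:
        return []
    freq = Counter(word for s in sentences for word in s.split())

    def score(sentence: str) -> float:
        # Average document-wide frequency of the words in the sentence.
        words = sentence.split()
        return sum(freq[w] for w in words) / len(words)

    keep = max(1, math.ceil(compression_ratio * len(sentences)))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:keep])]
```

Because the output consists only of sentences copied from the input, the summary is extractive: nothing is rephrased or generated.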


Urdu Domain Identification Service

The CLE Urdu domain identification service classifies documents into a set of predefined categories. Currently, the categories are National News, International News, Business, Sports, Science, Health and Showbiz.


Urdu Spell Checking Service

The CLE Urdu Spell Check Service helps verify the spelling of Urdu text. It accepts Urdu text as input and checks it for spelling errors. For each error identified, it generates a ranked list of suggested words.
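
A ranked suggestion list of this kind can be sketched with edit distance against a word list. The example below is an assumption-laden illustration: the functions `edit_distance` and `suggest`, the distance threshold and the tiny lexicon are hypothetical, and the ranking method of the CLE service is not documented here. Latin placeholder words are used for readability; the code works the same way on Urdu-script strings.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between `a` and `b`, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def suggest(word: str, lexicon: set[str],
            max_distance: int = 2, limit: int = 5) -> list[str]:
    """Return up to `limit` lexicon words, closest first (ties alphabetical).

    A word already in the lexicon is treated as correct and gets no suggestions.
    """
    if word in lexicon:
        return []
    scored = ((edit_distance(word, candidate), candidate) for candidate in lexicon)
    ranked = sorted((d, c) for d, c in scored if d <= max_distance)
    return [candidate for _, candidate in ranked[:limit]]


# Hypothetical lexicon of romanized placeholder words.
print(suggest("kitap", {"kitab", "kabab", "kitaben"}))
```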


Roman to Urdu Script Service

The CLE Roman to Urdu script service converts input text written in Roman script to Urdu script.


Aspect Based Sentiment Analysis Service

The Urdu Sentiment Analysis Service allows users to extract the opinion orientation of the input sentences. It has multiple applications, such as customer review analysis, popularity analysis of electoral candidates and hate speech detection. The service can be used in two modes: Data Analyst and A.I. Specialist. In Data Analyst mode, users simply upload their data and have it labelled by the available models within a few seconds. A.I. Specialist mode enables users to create customized machine learning models using their own labelled data.


Urdu Keyword Extraction Service

The CLE Urdu Keyword Extraction service extracts keywords from the input Urdu text as requested by the user.


Urdu Stemmer

The CLE Urdu stemmer web service returns the stemmed version of the input Urdu text. It follows the set of rules developed at CLE and discussed in the paper titled "Assas-band, an Affix-Exception-List Based Urdu Stemmer", and is offered as a web service for students and researchers working in the area of Urdu NLP.


Urdu Segmentation

The CLE Urdu word segmentation service accepts non-standardized text with irregular spacing between Urdu words and delivers sentences with properly separated words. A sentence is broken into ligatures and then submitted to a language model that has been thoroughly trained to generate the optimal sentence based on the provided Urdu ligatures.
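
The idea of choosing the best word sequence with a language model can be illustrated with a much simpler stand-in: a unigram word model and dynamic programming over a string with its spaces removed. This sketch works on characters rather than ligatures, and the function `segment`, the probability table and the unknown-character penalty are all hypothetical; it is not the model used by the CLE service. Latin placeholder words are used for readability, and the same code applies to Urdu-script strings.

```python
import math


def segment(text: str, word_probs: dict[str, float],
            max_word_len: int = 10) -> list[str]:
    """Split `text` (spaces already removed) into its most probable words."""
    unknown_log_prob = math.log(1e-9)  # penalty for a single unseen character
    n = len(text)
    best = [0.0] + [-math.inf] * n  # best[i]: top log-probability for text[:i]
    back = [0] * (n + 1)            # back[i]: start of the last word in text[:i]
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            piece = text[start:end]
            if piece in word_probs:
                log_p = math.log(word_probs[piece])
            elif len(piece) == 1:
                log_p = unknown_log_prob
            else:
                continue  # unknown multi-character pieces are not allowed
            if best[start] + log_p > best[end]:
                best[end] = best[start] + log_p
                back[end] = start
    # Walk the back-pointers from the end of the string to recover the words.
    words, end = [], n
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]


# Hypothetical unigram probabilities.
probs = {"the": 0.3, "cat": 0.2, "sat": 0.2, "cats": 0.05, "at": 0.1}
irregular = "th ecats at"
print(segment(irregular.replace(" ", ""), probs))  # ['the', 'cat', 'sat']
```

Removing the existing spaces before segmenting, as in the usage lines, is one way to handle input whose spacing cannot be trusted.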