Introduction
Welcome to the Turkish.AI API.
Our API aims to help you integrate Turkish language processing capabilities into your application rapidly and seamlessly.
We provide a single endpoint for a combined call to enable various NLP processing layers on the same document.
If you have any queries, please feel free to contact us at [email protected].
Authentication
To authorize, include the api-key header in your HTTP request:
# With shell, you can just pass the correct header with each request
curl "https://api.turkish.ai/core" \
  -H "api-key: 3b12d5a3-2d15-4239-ac01-7f0be69cb92c"
Make sure to replace 3b12d5a3-2d15-4239-ac01-7f0be69cb92c with your actual API key.
Turkish.AI uses API keys to allow access to the API. You can register for a new Turkish.AI API key at our site or, if you are already a member, access your API key by clicking here.
We expect your API key to be included in all HTTP requests to the server as an api-key header:
api-key: 3b12d5a3-2d15-4239-ac01-7f0be69cb92c
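The same header can be attached from Python. The following is a minimal sketch that uses only the standard library and builds the request without sending it; the helper name `build_request` is hypothetical, and the key is the placeholder from the example above.

```python
import urllib.request

API_URL = "https://api.turkish.ai/core"


def build_request(api_key: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the api-key header."""
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"api-key": api_key, "content-type": "application/json"},
        method="POST",
    )


# Placeholder key from the example above; replace it with your own.
request = build_request("3b12d5a3-2d15-4239-ac01-7f0be69cb92c", b"{}")
# urllib stores header names capitalized; HTTP header names are case-insensitive.
print(request.get_header("Api-key"))
```

Passing `request` to `urllib.request.urlopen` would actually send it.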
Core Service
curl "https://api.turkish.ai/core" \
  -H "api-key: 1234567890" \
  -H "content-type: application/json" \
  -d '{"text": "Bugün hava çok güzel. Zaten Ali söylemişti."}'
The endpoint for our core service is:
https://api.turkish.ai/core
This endpoint applies all processors selected in the processors list to the input text.
HTTP Request
POST https://api.turkish.ai/core
Parameters
All parameters must be packaged in one JSON data structure.
| Parameter | Required | Description |
|---|---|---|
| text | Yes | The UTF-8 encoded document text to be processed. A request cannot exceed 10 kilobytes in total. |
| dct | No | Document creation datetime (required for resolving relative temporal expressions). If omitted, the current date and time is used. |
| processors | No | The list of processors that can be selected. If omitted, all available processors will be applied. |
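As an illustration of packaging the parameters into one JSON structure, here is a minimal sketch. It rests on two assumptions that the text above does not confirm: that `dct` is sent in the `YYYY-MM-DD HH:MM` format seen in the sample response, and that "10 kilobytes" means 10 * 1024 bytes of encoded request body. The helper name `build_payload` is hypothetical.

```python
import json
from datetime import datetime
from typing import Optional

MAX_REQUEST_BYTES = 10 * 1024  # assumption: "10 kilobytes" means 10 * 1024 bytes


def build_payload(text: str, dct: Optional[datetime] = None) -> bytes:
    """Serialize the request parameters into one UTF-8 encoded JSON object."""
    params = {"text": text}
    if dct is not None:
        # Assumption: dct uses the "YYYY-MM-DD HH:MM" format of the sample response.
        params["dct"] = dct.strftime("%Y-%m-%d %H:%M")
    body = json.dumps(params, ensure_ascii=False).encode("utf-8")
    if len(body) > MAX_REQUEST_BYTES:
        raise ValueError(f"request is {len(body)} bytes; the limit is {MAX_REQUEST_BYTES}")
    return body


body = build_payload("Bugün hava çok güzel.", dct=datetime(2023, 1, 4, 12, 23))
print(body.decode("utf-8"))
```

Checking the size on the client side avoids sending a request that the server would presumably reject.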
Processors
Several processors are available, and you can choose some or all of them. Using processors does not count against your API rate limits. However, the more processors you choose, the longer your text takes to process, so select only the processors your application needs.
| Processor | Description |
|---|---|
| text_normalizer | Normalization of the input text |
| morph_analyzer | Morphological Analysis and morphological disambiguation |
| parser | Dependency parser |
| ner | Named entity recognition |
| sentiment_analyzer | Sentiment analysis (both for each sentence and for the whole text) |
| text_categorizer | Text categorization |
| keyword_extractor | Keyword extraction |
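Selecting a subset of processors could look like the sketch below. It assumes that `processors` is sent as a JSON array of the names listed in the table; the exact wire format is not specified here, and the helper name `with_processors` is hypothetical.

```python
import json
from typing import Dict, List

# Processor names copied from the table above.
AVAILABLE_PROCESSORS = (
    "text_normalizer", "morph_analyzer", "parser", "ner",
    "sentiment_analyzer", "text_categorizer", "keyword_extractor",
)


def with_processors(text: str, processors: List[str]) -> Dict[str, object]:
    """Return request parameters restricted to the given processors."""
    unknown = [name for name in processors if name not in AVAILABLE_PROCESSORS]
    if unknown:
        raise ValueError(f"unknown processors: {unknown}")
    # Assumption: "processors" is a JSON array of processor names.
    return {"text": text, "processors": list(processors)}


params = with_processors("Zaten Ali söylemişti.", ["morph_analyzer", "ner"])
print(json.dumps(params, ensure_ascii=False))
```

Rejecting unknown names locally catches typos before a request is spent on them.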
Response
{
"text": "Bugün hava çok güzel. Zaten Ali söylemişti.",
"dct": "2023-01-04 12:23",
"processing_time_in_seconds": 0.36,
"token_count": 9,
"sentences": [
{
"text": "Bugün hava çok güzel .",
"tokens": [
{
"index": 0,
"text": "Bugün",
"analyses": [
"bugün+Adverb"
],
"lemma": "bugün",
"pos": "Noun",
"tag": "Noun+A3sg+Pnon+Nom",
"dependency_head": 3,
"dependency_type": "advmod",
"dependency_prob": 0.9957938194274902
},
{
"index": 1,
"text": "hava",
"analyses": [
"hava+Noun+A3sg+Pnon+Nom"
],
"lemma": "hava",
"pos": "Noun",
"tag": "Noun+A3sg+Pnon+Nom",
"dependency_head": 3,
"dependency_type": "nsubj",
"dependency_prob": 0.9994691014289856
},
{
"index": 2,
"text": "çok",
"analyses": [
"çok+Adverb"
],
"lemma": "çok",
"pos": "Adverb",
"tag": "Adverb",
"dependency_head": 3,
"dependency_type": "advmod",
"dependency_prob": 0.9970120191574097
},
{
"index": 3,
"text": "güzel",
"analyses": [
"güzel+Adj"
],
"lemma": "güzel",
"pos": "Adj",
"tag": "Adj",
"dependency_head": -1,
"dependency_type": "root",
"dependency_prob": 0.9964820146560669
},
{
"index": 4,
"text": ".",
"analyses": [
".+Punct"
],
"lemma": ".",
"pos": "Punct",
"tag": "Punct",
"dependency_head": 3,
"dependency_type": "punct",
"dependency_prob": 0.9985439777374268
}
]
},
{
"text": "Zaten Ali söylemişti .",
"tokens": [
{
"index": 5,
"text": "Zaten",
"analyses": [
"zaten+Adverb"
],
"lemma": "zaten",
"pos": "Adverb",
"tag": "Adverb",
"dependency_head": 7,
"dependency_type": "advmod",
"dependency_prob": 0.9941402077674866
},
{
"index": 6,
"text": "Ali",
"analyses": [
"Ali+Noun+Prop+A3sg+Pnon+Nom"
],
"lemma": "Ali",
"pos": "Noun",
"tag": "Noun+Prop+A3sg+Pnon+Nom",
"dependency_head": 7,
"dependency_type": "nsubj",
"dependency_prob": 0.9987956285476685
},
{
"index": 7,
"text": "söylemişti",
"analyses": [
"söyle+Verb+Pos+Narr+Past+A3sg"
],
"lemma": "söyle",
"pos": "Verb",
"tag": "Verb+Pos+Narr+Past+A3sg",
"dependency_head": -1,
"dependency_type": "root",
"dependency_prob": 0.994990885257721
},
{
"index": 8,
"text": ".",
"analyses": [
".+Punct"
],
"lemma": ".",
"pos": "Punct",
"tag": "Punct",
"dependency_head": 7,
"dependency_type": "punct",
"dependency_prob": 0.9992388486862183
}
]
}
],
"quoted_sentences": [],
"entities": [
{
"text": "Bugün",
"start": 0,
"end": 0,
"type": "TIMEX3",
"score": null,
"attributes": {
"tid": "t1",
"type": "DATE",
"value": "2023-01-04"
}
},
{
"text": "Ali",
"start": 6,
"end": 6,
"type": "PERSON",
"score": 0.997,
"attributes": {
"normalized": "Ali"
}
}
]
}
A successful Core Service call returns a JSON object with the following sections:
text
This is the normalized version of the text sent in the text parameter.
dct
Document creation date time sent in the dct parameter. All relative temporal expressions are normalized based on this value.
processing_time_in_seconds
Total processing time of the request in seconds.
token_count
Total number of tokens generated after core service analysis.
sentences
Sentences extracted by the sentence splitter. Each sentence element has the following fields:
text
Normalized version of the sentence.
tokens
Tokens constituting the sentence. Each token has the following structure:
| Field | Description |
|---|---|
| index | Global index of the token in the text. Starts from 0. |
| text | Token text |
| analyses | Morphological analysis results (the first element in the list is the analysis disambiguated in context). |
| lemma | Lemma of the surface form |
| pos | Part-of-speech tag of the surface form (if derivations exist, the last POS of the derived form). |
| tag | Morphological tags |
| dependency_head | Index of the head token of the dependency relation (-1 for the root, as in the sample response) |
| dependency_type | The type of the dependency relation |
| dependency_prob | The probability of the dependency relation |
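The token fields can be combined to read off the dependency tree. The sketch below works on a trimmed copy of the first sentence of the sample response. It assumes that `dependency_head` holds the global index of the head token and that -1 marks the root, which is inferred from the sample rather than stated; the function name `dependency_edges` is hypothetical.

```python
# Trimmed copy of the first sentence from the sample response above.
sentence = {
    "text": "Bugün hava çok güzel .",
    "tokens": [
        {"index": 0, "text": "Bugün", "dependency_head": 3, "dependency_type": "advmod"},
        {"index": 1, "text": "hava", "dependency_head": 3, "dependency_type": "nsubj"},
        {"index": 2, "text": "çok", "dependency_head": 3, "dependency_type": "advmod"},
        {"index": 3, "text": "güzel", "dependency_head": -1, "dependency_type": "root"},
        {"index": 4, "text": ".", "dependency_head": 3, "dependency_type": "punct"},
    ],
}


def dependency_edges(sentence: dict) -> list:
    """Return (dependent, relation, head) triples; the root's head is None."""
    by_index = {token["index"]: token for token in sentence["tokens"]}
    edges = []
    for token in sentence["tokens"]:
        # Assumption: -1 marks the root, so it has no entry in by_index.
        head = by_index.get(token["dependency_head"])
        edges.append((token["text"], token["dependency_type"], head["text"] if head else None))
    return edges


for dependent, relation, head in dependency_edges(sentence):
    print(f"{dependent} --{relation}--> {head or 'ROOT'}")
```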
sentence_sentiment
The sentiment analysis results for the sentence.
quoted_sentences
A sentence may contain quoted sentences (sub-sentences). These are denoted here by their start and end token indexes. Note that these indexes are global across the whole text. Here is an example:
"quoted_sentences": [{ "start": 15, "end": 22 }]
entities
Named entities captured in the text are placed here. The fields common to every entity are given below:
| Field | Description |
|---|---|
| text | Running text of the captured named entity |
| start | Token index of the named entity beginning |
| end | Token index of the named entity end |
| type | Type of the entity |
| score | Recognition score (if available) |
| attributes | Entity type specific features dictionary |
For most entities, the attributes dictionary holds the normalized value of the entity.
Numeric entities have type and value attributes. The type attribute may be one of:
- numeric (123.45)
- text (yüz yirmi üç)
- mixed (50 milyon)
The value of the numeric expression is converted to a real number and stored in the value field of the attributes dictionary.
Temporal expressions are tagged with the TIMEX3 entity type. The attributes dictionary of a temporal entity has the following fields:
- tid: id of the temporal expression
- type: one of DATE, TIME, DURATION, INTERVAL
- value:
More information about the temporal attributes can be found in http://timeml.org/site/publications/timeMLdocs/timeml_1.2.1.html
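To show how entity spans and attributes might be consumed, here is a minimal sketch over the data from the sample response. It assumes that `end` is an inclusive token index, which the sample suggests because single-token entities have equal `start` and `end`; the same helper would apply to quoted_sentences only under that same assumption. The function names are hypothetical.

```python
# Token texts keyed by global index, taken from the sample response above.
tokens = {0: "Bugün", 1: "hava", 2: "çok", 3: "güzel", 4: ".",
          5: "Zaten", 6: "Ali", 7: "söylemişti", 8: "."}

# The "entities" array from the sample response above.
entities = [
    {"text": "Bugün", "start": 0, "end": 0, "type": "TIMEX3", "score": None,
     "attributes": {"tid": "t1", "type": "DATE", "value": "2023-01-04"}},
    {"text": "Ali", "start": 6, "end": 6, "type": "PERSON", "score": 0.997,
     "attributes": {"normalized": "Ali"}},
]


def span_text(tokens: dict, start: int, end: int) -> str:
    """Join the tokens covered by a span of global token indexes."""
    # Assumption: "end" is inclusive.
    return " ".join(tokens[i] for i in range(start, end + 1))


def temporal_values(entities: list) -> list:
    """Return (text, TIMEX3 type, value) for every temporal expression."""
    return [
        (e["text"], e["attributes"].get("type"), e["attributes"].get("value"))
        for e in entities
        if e["type"] == "TIMEX3"
    ]


for entity in entities:
    print(entity["type"], span_text(tokens, entity["start"], entity["end"]))
print(temporal_values(entities))
```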
Errors
The Turkish.AI API uses the following error codes:
| Error Code | Meaning |
|---|---|
| 400 | Bad Request -- Your request is invalid. |
| 401 | Unauthorized -- Your API key is wrong. |
| 403 | Forbidden -- The service requested is hidden for administrators only. |
| 404 | Not Found -- The specified service could not be found. |
| 405 | Method Not Allowed -- You tried to access a service with an invalid method (GET or POST). |
| 406 | Not Acceptable -- You requested a format that isn't JSON. |
| 410 | Gone -- The service requested has been removed from our servers. |
| 429 | Too Many Requests -- You have exceeded your API usage limits. |
| 500 | Internal Server Error -- We had a problem with our server. Try again later. |
| 503 | Service Unavailable -- We're temporarily offline for maintenance. Please try again later. |
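A client might map these codes to a simple retry decision, as in the sketch below. The choice of which statuses to retry (429, 500, 503) is an assumption based on the "try again later" wording in the table, not a documented policy, and the function names are hypothetical.

```python
# Status codes and meanings copied from the table above.
ERROR_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    405: "Method Not Allowed",
    406: "Not Acceptable",
    410: "Gone",
    429: "Too Many Requests",
    500: "Internal Server Error",
    503: "Service Unavailable",
}

# Assumption: only rate limiting and server-side failures are worth retrying.
RETRYABLE_STATUSES = frozenset({429, 500, 503})


def describe_error(status: int) -> str:
    """Return the documented meaning of a status code, if there is one."""
    return ERROR_MEANINGS.get(status, "Unknown error")


def should_retry(status: int) -> bool:
    """Tell whether a later retry of the same request could plausibly succeed."""
    return status in RETRYABLE_STATUSES


for status in (401, 429, 503):
    action = "retry later" if should_retry(status) else "do not retry"
    print(status, describe_error(status), action)
```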