DPS Lab – Professional Data and Text Analysis Tools

DPS Lab

State-of-the-art data and text analysis tools

About DPS Lab

Pipeline

Create your own customized pipelines

The software is designed for building large text- and data-processing tasks composed of data imputation, preprocessing, processing, and interpretation of outputs. A results-sharing feature streamlines team communication.
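The idea of composing a task from chained processing steps can be sketched in plain Python. The node functions and the `compose` helper below are illustrative, not the DPS Lab API:

```python
# Minimal sketch of a pipeline built from processing nodes.
# Each node takes the previous node's output as its input.

def lowercase(text: str) -> str:
    """Pre-processing node: normalize case."""
    return text.lower()

def tokenize(text: str) -> list[str]:
    """Processing node: split text into tokens."""
    return text.split()

def compose(*nodes):
    """Chain nodes so the output of one feeds the next."""
    def pipeline(data):
        for node in nodes:
            data = node(data)
        return data
    return pipeline

pipeline = compose(lowercase, tokenize)
print(pipeline("Analyze This Text"))  # ['analyze', 'this', 'text']
```

In the platform this chaining is done visually by connecting nodes; the sketch only shows the underlying data flow.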
Combination

Unique combination of objects

Text datasets, Mongo DB, Postgres DB, External API, YouTube, Scrapers, Tickers, Factor groups, Models, Parquet DS, NLP tasks, Pre-Processing tasks, Data quality check, Financial features, Models training, Summarization, Sentiment, Charts.
Versatility

Versatility

The modules accept input data from many areas characterized by stochastic behavior, such as biology or meteorology. The application is ready to integrate several types of data and deliver a solution in the desired domain.

How does DPS Lab work?

Inputs

The input objects are functions for importing data in various formats.

The platform can connect to public and private sector information systems, as well as to applications that publish data at regular intervals.

The data used ranges from open public-sector data (US Statements/EDI), through data downloaded by web crawlers (for example from Yahoo Finance or Seeking Alpha) and paid APIs (for example RapidAPI), to private data that users upload to the platform themselves.

Text datasets
Node for textual dataset selection.
Mongo DB
Node for loading data from an external Mongo database.
Postgres DB
Node for loading data from an external PostgreSQL database.
External API
Node for loading data from an external API.
YouTube
Node for loading data from YouTube videos.
Scraper
Node for loading data from a scraper.
Tickers
Node for ticker selection.
Factor groups
Node for factor group selection.
Models
Node for model selection.
Parquet DS
Node for parquet dataset selection.
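All of these input nodes hand their data to the rest of the pipeline in a uniform way. A minimal sketch of that pattern, with hypothetical node classes that only stand in for the real connectors:

```python
# Hedged sketch: input nodes exposing a common load() interface.
# The class names and constructors are illustrative, not the DPS Lab API.
import json
from abc import ABC, abstractmethod

class InputNode(ABC):
    """Common interface every input node implements."""
    @abstractmethod
    def load(self):
        ...

class TextDatasetNode(InputNode):
    def __init__(self, records):
        # A real node would read the dataset from storage.
        self.records = records
    def load(self):
        return list(self.records)

class ExternalAPINode(InputNode):
    def __init__(self, raw_json: str):
        # Stand-in for the body of an HTTP response.
        self.raw_json = raw_json
    def load(self):
        return json.loads(self.raw_json)

docs = TextDatasetNode(["first document", "second document"]).load()
api_data = ExternalAPINode('{"ticker": "AAPL", "price": 170.5}').load()
```

Because every node returns plain Python data from `load()`, downstream preprocessing nodes do not need to know which source the data came from.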

Preprocessing and Processing

The data collected by the platform is then available for preprocessing and processing. The platform provides new data processing functions, such as summarization and sentiment identification tools.

These functions were developed using the latest knowledge in machine learning, deep learning, and artificial intelligence (AI), and cognitive technologies in general.

Tokenize
Tokenizes string input into an array output. Czech or English can be selected.
Remove stop words
Removes basic stop words. Czech or English can be selected.
Check data quality
The input can be a ticker (stock) or a parquet file.
Lemmatize
Lemmatizes text. Czech or English can be selected.
POS Tagger
Part-of-speech recognition. Czech or English can be selected.
Financial features
The input is a ticker (stock) and the selected factor group. The performance factors for the selected ticker are calculated according to the group.
Drop columns
Drops selected columns.
Rename columns
Renames selected columns.
Train model
The input is the parquet file and the selected model.
Pre-Processing
The most common preprocessing techniques: removing numbers, emoticons, accents, etc.
Summarize
Summarizes the given text.
Sentiment
Gets the sentiment of the given text.
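To make the tokenize, stop-word, and sentiment steps concrete, here is a toy sketch in plain Python. The tiny stop-word set and sentiment lexicon are invented for illustration; the platform's real nodes rely on trained language models:

```python
# Illustrative versions of three processing nodes.
# STOPWORDS / POSITIVE / NEGATIVE are toy word lists, not real resources.
import re

STOPWORDS = {"the", "is", "a", "and"}
POSITIVE = {"great", "good"}
NEGATIVE = {"bad", "poor"}

def tokenize(text: str) -> list[str]:
    """Split lowercased text into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop tokens found in the stop-word set."""
    return [t for t in tokens if t not in STOPWORDS]

def sentiment(tokens: list[str]) -> str:
    """Count positive vs. negative lexicon hits."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

tokens = remove_stop_words(tokenize("The stock outlook is great"))
print(tokens, sentiment(tokens))  # ['stock', 'outlook', 'great'] positive
```

Each function maps onto one node in the pipeline, which is why the nodes can be freely reordered or combined.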

Results displaying and saving

The data processing stage is followed by the data storage stage (data repository).

Data can be exported from the platform into machine-readable formats, which enable further editing and import into other systems. Communication with third-party systems or applications is ensured both by the export option and by an API output, which enables automated data transfer to other systems.
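As a sketch of such a machine-readable export, the snippet below writes processed results to CSV with the Python standard library. The field names are invented for illustration:

```python
# Hedged sketch: exporting pipeline results to CSV (a machine-readable format).
import csv
import io

# Hypothetical results from a sentiment pipeline run.
rows = [
    {"ticker": "AAPL", "sentiment": "positive"},
    {"ticker": "MSFT", "sentiment": "neutral"},
]

buffer = io.StringIO()  # in-memory stand-in for an exported file
writer = csv.DictWriter(buffer, fieldnames=["ticker", "sentiment"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

A CSV (or JSON) export like this is what lets third-party systems consume the results without any knowledge of the platform itself.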

Console
Writes data to the developer console in the browser and to the log at the bottom right.
Data quality report
Displays a clear report of the data quality check.
Chart
Displays data in a chart.