Transform Huge Files Faster 
Smart Data Architects Use Flat Files and CoSort

Challenges:
Do you need to manipulate large volumes of data? Integrating, staging, and reporting on large data volumes inside your database, ETL, or BI tool can take a very long time and can require costly software and hardware upgrades or appliances. Meanwhile, the overhead of these internal transformations slows down other users, ETL component runtimes, database query response times, and digital displays.

You may also need to directly manipulate or convert mainframe file formats, like Micro Focus ISAM or variable block, to and from desktop formats like record sequential, CSV, LDIF, and XML. Or, you may need to reformat and compare flat files for reporting purposes and/or what-if analyses.

Solutions:
Flat files are not only a convenient, common format for data exchange, but often the fastest way to change operational data and produce reports. If you use flat files, you should know that CoSort has long been considered one of the fastest flat-file data transformation tools available. Since 1992, CoSort has used a popular 4GL called SortCL to combine multiple data transformations in the same job script and I/O pass, so that large files can be manipulated in batch streams.

SortCL runs from the operating system or from within your programs to accelerate or replace existing applications or file development environments. In addition to being easier to code and maintain, SortCL improves runtime efficiency because it exploits file system I/O, memory, and CPUs directly, and removes the heavy burden of multi-gigabyte file transformations from your database, ETL, or BI tool.

In a single SortCL job script, and a single I/O pass, you can:
• Input one or more large sequential data sources
• Run multiple transformations (filter, sort, etc.)
• Compare files to capture changes and feed BI
• Re-map, re-size, re-format, and pivot columns
• Create segmented, customized report targets
• Convert data types and file formats
• Protect sensitive data at the field level
• Generate safe test data in custom file formats
• Output to multiple targets simultaneously
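
To make the single-pass idea concrete, here is a minimal sketch in Python. It is not SortCL syntax and not how CoSort is implemented; it only illustrates the pattern of reading a flat file once while filtering, converting a data type, masking a field, sorting, and feeding two different targets. The column names (`region`, `account`, `amount`), the `min_amount` threshold, the truncated SHA-256 masking, and the in-memory sort are all assumptions chosen for illustration.

```python
import csv
import hashlib
import io


def transform(source, min_amount=100.0):
    """Read CSV text once; return (sorted detail rows, per-region totals).

    Hypothetical layout: region, account, amount.
    """
    detail = []
    totals = {}
    for row in csv.DictReader(io.StringIO(source)):
        amount = float(row["amount"])          # convert data type
        if amount < min_amount:                # filter
            continue
        # Protect the sensitive field with a truncated one-way hash
        # (an illustrative choice, not a recommendation).
        masked = hashlib.sha256(row["account"].encode()).hexdigest()[:8]
        detail.append(
            {"region": row["region"], "account": masked, "amount": f"{amount:.2f}"}
        )
        # Second target: running total per region, built in the same pass.
        totals[row["region"]] = totals.get(row["region"], 0.0) + amount
    # Sort by region, then by amount descending (in memory, for simplicity).
    detail.sort(key=lambda r: (r["region"], -float(r["amount"])))
    return detail, totals


def to_csv(rows, fieldnames):
    """Serialize a list of dicts as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


SAMPLE = """region,account,amount
east,1001,250
west,1002,75
east,1003,900.5
west,1004,120
"""

if __name__ == "__main__":
    detail, totals = transform(SAMPLE)
    # Target 1: masked, sorted detail file.
    print(to_csv(detail, ["region", "account", "amount"]), end="")
    # Target 2: summary report by region.
    summary = [{"region": k, "total": f"{v:.2f}"} for k, v in sorted(totals.items())]
    print(to_csv(summary, ["region", "total"]), end="")
```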

In addition to runtime performance, coding these operations in SortCL can also be faster. SortCL uses a human-readable 4GL that leverages familiar data layout syntax, SQL manipulation concepts, and centralized metadata repositories. SortCL scripts all follow the same logical flow (input, process, output), and are usually much shorter than equivalent SQL procedures, Perl scripts, or programs written in C, COBOL, Java, VB, etc.

See also:
FAQ > Flat Files
FAQ > ETL
Solutions > Data Transformation
Solutions > Business Intelligence
Solutions > Field Protection
Solutions > File Interchange
Solutions > Test Data/Files
Products > CoSort > SortCL
