The APEX_DATA_PARSER is a frequently used API which provides an easy interface to parse files from different file formats such as comma-separated-values (.csv), json, xml and xlsx. It was first introduced in APEX 19.1 version
It also helps us to analyze a file to understand its format and structure and identify the field names and datatypes of the fields.
In order for APEX_DATA_PARSER to parse a file, it must be in a BLOB format. We can make use of APEX_WEB_SERVICE API for this purpose.
The apex_web_service.make_rest_request_b is a PL/SQL function which invokes a RESTful style Web service and returns the results in a BLOB.
The parser is implemented as a table function - so the developer accesses parser results like a table. Therefore, the parser can utilize INSERT ... SELECT statements to insert rows directly into a table from the specified file.
In today's blog I am going to show a simple example of file parsing using APEX_DATA_PARSER API.
Step 1: I uploaded a simple .csv file into my Github repository. The link for the same is as follows :
Step 2: Created a table (name: API_DATA) to hold the .csv file data. Here I am using replica of EMP table to hold data from emp.csv file.
Step 3: The function supports parsing up to 300 columns
Code
insert into API_DATA
(EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO, BONUS)
select col001, col002, col003, col004, col005, col006, col007, col008, col009
from table
( apex_data_parser.parse
( p_content=> apex_web_service.make_rest_request_b('https://raw.githubusercontent.com/naveenaggarwal12/Emp-Data/main/emp.csv', 'GET')
, p_file_name=> 'emp.csv', p_skip_rows=> 1 )
);
This script uses a REST request to GET the table data in csv format from the github website, parses the csv using APEX_DATA_PARSER and then insert the data into the destination table API_DATA.
Description of the parameters used above:
P_CONTENT - the file content to be parsed as a BLOB
P_FILE_NAME - the name of the file; this used to derive the file type.
.xml extensions will be treated as XML files
.json, .geojson or .js extensions will be treated as JSON files
.csv or .txt extensions will be treated as delimited files
.xlsx extensions will be treated as Excel Workbooks (only the XLSX format is supported)
P_SKIP_ROWS - skip the first N rows before parsing the rest of the file.
There are many more parameter also which can be used if required.
Step4 : When we run the above script it inserts the data in the table.
APEX_DATA_PARSER also discovers information on columns and data types, which can be retrieved from a collection or directly using APEX_DATA_PARSER.DISCOVER and APEX_DATA_PARSER.GET_COLUMNS functions.
APEX_DATA_PARSER expects the file contents simply as a BLOB - it does not matter where this BLOB comes from. While we used a file upload in this example, the BLOB can also come from an existing table or from the web using the APEX_WEB_SERVICE or UTL_HTTP packages.
Go through the following link to read more about the APEX_DATA_PARSER
I hope you enjoyed reading my post.
Thanks for your time !!
Comments