Introduction
Dataload is the backbone of any e-Commerce application: it is required not only to kick-start the application, such as setting up the online catalog navigation, but also to address intermittent incoming data requests, such as setting up user and buyer accounts on the fly.
Whether it is a dataload or a data migration, we always need an ultimate source of data to load into the WCS DB. A specific chunk of WCS data can also be migrated, either from an existing older-version WCS DB or from some other legacy data repository, using the following steps-
1. Data Extraction/Data Retrieval
2. Dataload Utility Customization (Optional)
3. Data Load using DataLoad Utility
4. Dataload Automation (Optional)
1. Data Extraction/Data Retrieval
Depending on whether we are performing a data migration or a data load, we either extract the data from the source WCS DB or retrieve it from the ultimate source, in a CSV format that WCS understands.
Data can be extracted from the source DB, be it a WCS DB or a non-WCS DB, in the form of CSV using SQL queries. We need to plan the data extraction so that the extracted data, once loaded into the target WCS DB, drives the functionality concerned. For example, while migrating orders from a legacy system we need to ensure that we extract all the related data, such as order items, users, and user roles, so that the migrated orders remain functional in the new setup.
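As a sketch of this step, the snippet below generates a DB2 EXPORT statement that pulls a chunk of order data into CSV. The table and column names follow the standard WCS ORDERS schema, but the cutoff date and file names are assumptions for illustration only:

```shell
# Sketch only: write out the extraction SQL used to pull orders into CSV.
# Cutoff date and file names are assumptions for illustration.
cat > extract_orders.sql <<'EOF'
EXPORT TO orders.csv OF DEL
SELECT O.ORDERS_ID, O.MEMBER_ID, O.STOREENT_ID, O.TOTALPRODUCT
FROM ORDERS O
WHERE O.TIMEPLACED >= '2015-01-01';
EOF
# Run it against the source DB, e.g.: db2 -tvf extract_orders.sql
```

Related tables (ORDERITEMS, USERS, MBRROLE, and so on) would each get a similar extract so the loaded orders stay functional.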
2. Dataload Utility Customization (Optional)
The dataload utility is a business-object-based, customizable utility that scales well to custom dataload requirements, including loads into new custom tables/custom columns, and also allows injecting business rules into the data load.
PFB the four pillars of the dataload utility-
- DataReader: reads the data source, e.g. CSVReader, XMLReader
- BusinessObjectBuilder: populates the data object, e.g. TableObjectBuilder, BaseBusinessObjectBuilder
- BusinessObjectMediator: transforms the data object into physical objects, e.g. PersonMediator
- DataWriter: saves the physical objects in the DB using JDBC, e.g. JDBCDataWriter
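Condensed into a single wc-loader-style fragment, the four pillars might be wired as below. The class names are the OOB ones to the best of my knowledge, but element placement varies; notably, the DataWriter is usually declared in wc-dataload-env.xml rather than in the loader file, so treat this as a sketch:

```xml
<_config:DataLoader className="com.ibm.commerce.foundation.dataload.BusinessObjectLoader">
    <_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.CSVReader"
        firstLineIsHeader="true"/>
    <_config:BusinessObjectBuilder
        className="com.ibm.commerce.foundation.dataload.businessobjectbuilder.TableObjectBuilder">
        <!-- table-to-CSV column mappings go here -->
    </_config:BusinessObjectBuilder>
    <_config:BusinessObjectMediator
        className="com.ibm.commerce.foundation.dataload.mediator.TableObjectMediator"/>
    <_config:DataWriter
        className="com.ibm.commerce.foundation.dataload.datawriter.JDBCDataWriter"/>
</_config:DataLoader>
```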
WCS does provide OOB component mediators, such as for catalog, inventory, and price, which can be further customized to suit the business dataload requirements. On the other hand, WCS also provides the OOB TableObjectMediator, which is very handy for loading data directly into a single table by mapping the CSV columns to the target custom table columns.
So the customization starting point depends on which business object we are aiming to load and what OOB support is provided in terms of an OOB component mediator to use as a base.
PFB the OOB component mediators provided at the moment-
Catalog/ Commerce Composer/ Content/ Member/ Price/ Promotion/ StoreConfiguration
2B. Loading Data Utilizing TableObjectMediator
A few OOB table dataloads, such as the Orders and extension-table data loads, can be done through table-to-CSV mapping using the TableObjectBuilder.
PFB a few of the essentials used within dataload-
IDResolver
IDResolver helps resolve an ID based upon the unique index of the table; it can also generate a new ID if the unique-index data is absent in the table. For example:
<_config:Table name="ORDERS" excludeUnListedColumns="true" deleteKey="Delete" deleteValue="1">
    <_config:Column name="ORGENTITY_ID" value="ORGENTITY_ID" valueFrom="IDResolve">
        <_config:IDResolve tableName="ORGENTITY" generateNewKey="false" primaryKeyColumnName="ORGENTITY_ID">
            <_config:UniqueIndexColumn name="LEGALID" value="orgLegalID" />
        </_config:IDResolve>
    </_config:Column>
</_config:Table>
ColumnHandler
It allows applying business rules to manipulate a column's value as it is loaded through the dataload utility.
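For instance, a column handler can be attached to a column mapping. The OOB OptCounterColumnHandler shown below is a real WCS handler to the best of my knowledge; the second column uses a hypothetical custom handler class, and the table and CSV column names are assumptions:

```xml
<_config:Table name="ORDERS" excludeUnListedColumns="true">
    <!-- OOB handler that maintains the optimistic-locking counter -->
    <_config:Column name="OPTCOUNTER" value="optCounter"
        columnHandler="com.ibm.commerce.foundation.dataload.columnhandler.OptCounterColumnHandler"/>
    <!-- Hypothetical custom handler applying a business rule to the loaded value -->
    <_config:Column name="FIELD1" value="orderNotes"
        columnHandler="com.mycompany.dataload.OrderNotesColumnHandler"/>
</_config:Table>
```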
Business Context Service
It helps retrieve contextual information such as the storeID, catalogID, and so on.
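This context is typically seeded in wc-dataload-env.xml; a minimal fragment might look as below, where the store and catalog identifiers are sample values from the Aurora starter store and will differ per site:

```xml
<_config:BusinessContext storeIdentifier="AuroraESite"
    catalogIdentifier="Aurora" languageId="-1" currency="USD"/>
```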
3. Data Load using DataLoad Utility
Once we are done with our customization, we can set up and validate the utility, which comprises the below configuration files-
a. Environment configuration file (wc-dataload-env.xml)
b. Load order configuration file (wc-dataload.xml)
c. Business object configuration file (wc-loader-<object>.xml)
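A minimal load order configuration tying the three files together might look as below; this is a sketch, and the load item name, file names, and commit/batch settings are assumptions:

```xml
<_config:DataLoadConfiguration
    xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
    <_config:DataLoadEnvironment configFile="wc-dataload-env.xml"/>
    <_config:LoadOrder commitCount="100" batchSize="1" dataLoadMode="Replace">
        <_config:LoadItem name="Orders" businessObjectConfigFile="wc-loader-orders.xml">
            <_config:DataSourceLocation location="orders.csv"/>
        </_config:LoadItem>
    </_config:LoadOrder>
</_config:DataLoadConfiguration>
```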
4. Dataload Automation (Optional)
Dataload can be automated on the server environments with the help of shell scripts so that no manual intervention is required for execution.
- dataloadenv.sh can be used as a base for the automation
- custom mediator Java files can be bundled into jar files
- custom jars can be added to the classpath of the utility.
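The points above can be sketched as a small wrapper script. The dataload.sh location, custom jar path, and log naming here are all assumptions and will differ per environment:

```shell
#!/bin/sh
# Minimal automation sketch around the OOB dataload script.
# All paths here are assumptions and will differ per environment.
run_dataload() {
    config="$1"
    # DATALOAD_CMD can be overridden; defaults to an assumed install path
    dataload_cmd="${DATALOAD_CMD:-/opt/WebSphere/CommerceServer/bin/dataload.sh}"
    log="dataload_$(date +%Y%m%d_%H%M%S).log"

    # Custom mediator jars bundled and added to the utility classpath
    CLASSPATH="/opt/dataload/lib/custom-mediators.jar:${CLASSPATH}"
    export CLASSPATH

    if "$dataload_cmd" "$config" > "$log" 2>&1; then
        echo "Dataload completed: $config"
    else
        echo "Dataload failed, see $log" >&2
        return 1
    fi
}
```

Scheduling such a wrapper through cron then removes any manual intervention from the execution.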