
EDQ Interview Questions-3

  1. Which processor do you use to exclude duplicate records?
First, identify the duplicates with the “Duplicate Check” processor, supplying the attributes on which duplicates should be detected.
Then take only the records from the “Non-Duplicated” output port of this processor, which removes the duplicated records from the data stream.
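
EDQ processors are configured in the Director UI rather than in code, so the following is only a Python analogy of the routing described above. It assumes that every record sharing a key value with another record goes to the duplicated output; the function name `duplicate_check` and the sample records are hypothetical.

```python
from collections import Counter

def duplicate_check(records, key_attrs):
    """Split records into (non_duplicated, duplicated) on the chosen key attributes."""
    def key(record):
        return tuple(record[a] for a in key_attrs)

    counts = Counter(key(r) for r in records)
    non_duplicated = [r for r in records if counts[key(r)] == 1]
    duplicated = [r for r in records if counts[key(r)] > 1]
    return non_duplicated, duplicated

# Hypothetical sample data: ids 1 and 3 share the same email.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "a@example.com"},
]
non_duplicated, duplicated = duplicate_check(records, ["email"])
print(non_duplicated)  # [{'id': 2, 'email': 'b@example.com'}]
```

Note that under this assumption both copies of a duplicated record leave the stream; keeping one survivor per group is what the merge approach in the next question is for.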

  2. Which processor is used to eliminate duplicates?
To eliminate duplicates, use the “Group and Merge” processor, which has three sub-processors: Input, Group and Merge.
  • In the Input sub-processor, add the attributes to be carried through this data stream.
  • In the Group sub-processor, add the attribute(s) on which duplicates should be eliminated.
  • In the Merge sub-processor, select the relevant merge function; the default is “Most Common Value”.
Use the Merged output for the de-duplicated records.
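
The three steps above can be pictured with a small Python sketch. It is an analogy, not EDQ code: `group_and_merge` is a hypothetical helper, and the only merge function modelled is a most-common-value rule, with ties going to the value seen first (an assumption of this sketch).

```python
from collections import Counter, defaultdict

def group_and_merge(records, input_attrs, group_attrs):
    """Group records on group_attrs and merge each group to one record,
    taking the most common value of every input attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[a] for a in group_attrs)].append(record)

    merged = []
    for rows in groups.values():
        merged.append({
            attr: Counter(row[attr] for row in rows).most_common(1)[0][0]
            for attr in input_attrs
        })
    return merged

# Hypothetical sample data: customer C1 appears three times.
records = [
    {"cust": "C1", "city": "Pune"},
    {"cust": "C1", "city": "Pune"},
    {"cust": "C1", "city": "Mumbai"},
    {"cust": "C2", "city": "Delhi"},
]
merged = group_and_merge(records, ["cust", "city"], ["cust"])
print(merged)
```

Each group collapses to a single record, so C1 survives once with its most frequent city.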

  3. What is the difference between the “Lookup and Return” and “Lookup Check” processors?
Lookup and Return performs a lookup against the reference data (or lookup) and brings back the return attribute(s), which can be added as new attribute(s) or used to update existing columns in the data stream.
Lookup Check performs a lookup against the reference data (or lookup) only to check whether the attribute values exist there; it does not bring back any return attributes, even if the reference data provides them.
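
As a rough Python analogy (not EDQ code), the difference is the one between fetching a value from a dictionary and only testing membership. The `reference` table and both function names below are hypothetical.

```python
# Hypothetical reference data: lookup key -> return attribute.
reference = {"IN": "India", "US": "United States"}

def lookup_and_return(record, key_attr, new_attr):
    """Return a copy of the record enriched with the value found in the reference data."""
    enriched = dict(record)
    enriched[new_attr] = reference.get(record[key_attr])  # None when there is no match
    return enriched

def lookup_check(record, key_attr):
    """Only report whether the key exists in the reference data; nothing is returned from it."""
    return record[key_attr] in reference

record = {"name": "Asha", "country_code": "IN"}
enriched = lookup_and_return(record, "country_code", "country_name")
exists = lookup_check(record, "country_code")
print(enriched, exists)
```

The first call adds a new attribute to the record, while the second leaves the record unchanged and only yields a pass or fail result.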

  4. How do you convert a Date attribute to a different format? For example, MM/DD/YYYY HH:MM:SS to DD/MM/YY.
If the attribute that contains the date is of STRING data type, first convert it to a Date using the “Convert String to Date” processor, then use the “Convert Date to String” processor, providing the desired output format in the “Options” of that processor.
If the attribute that contains the date is of DATE data type, convert it to a String using the “Convert Date to String” processor, providing the desired output format in the “Options” of that processor; if required, you can then convert it back to DATE.
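
The same parse-then-format round trip can be illustrated in Python. This is only an analogy of the two conversion steps; the sample value is made up, and the format codes are Python `strptime`/`strftime` codes, not EDQ format masks.

```python
from datetime import datetime

raw = "12/25/2023 14:30:00"  # hypothetical string value in MM/DD/YYYY HH:MM:SS

# Step 1: string -> date, using the input format.
parsed = datetime.strptime(raw, "%m/%d/%Y %H:%M:%S")

# Step 2: date -> string, using the desired output format DD/MM/YY.
formatted = parsed.strftime("%d/%m/%y")

print(formatted)  # 25/12/23
```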

  5. How do you add a unique row identifier to each record in EDQ?
To generate a unique row identifier, use the “Add Message Id” processor. It adds a Number attribute that assigns a sequential number to each record.
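
In Python terms, the effect is similar to numbering records as they pass through. The sketch below is an analogy only; `add_row_id`, the attribute name `row_id`, and the starting value of 1 are assumptions, not EDQ settings.

```python
from itertools import count

def add_row_id(records, attr="row_id", start=1):
    """Return copies of the records with a sequential numeric identifier added."""
    ids = count(start)
    return [{attr: next(ids), **record} for record in records]

rows = add_row_id([{"name": "Asha"}, {"name": "Ravi"}, {"name": "Meera"}])
print(rows)
```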
