GitHub user dave-csc added a comment to the discussion: How to create complex 
XML structures?

For those looking for a very clumsy and complex workaround, here's how to do 
it. You first need a sub-pipeline with this layout:
![image](https://github.com/user-attachments/assets/415c6e9a-5587-4631-b8ea-d9340150c5f4)

- **Get rows from result** receives the stream from the parent pipeline, which 
carries the XML fragments for the individual rows
- **Get variables** needs to read two parameters passed from the parent: the 
`GROUP_ID` and the `XML_TEMPLATE`
- **XML Join** should be configured as follows:
  - _Target XML transform_ = Get variables
  - _Target XML field_ = the field in which you read `XML_TEMPLATE`
  - _Source XML transform_ = Get rows from result
  - _Source XML field_ = ideally, this would be passed in through a variable to 
make the sub-pipeline reusable. In practice, this doesn't work, and you need to 
specify the field name exactly as it is passed from the parent pipeline (see 
#5396)
  - _XPath statement_ = ideally, this would also be passed in from the parent, 
but that doesn't work either (same #5396). So specify the XPath of the "merge 
point" directly here
  - _Result XML field_ = give a name of your choice, e.g. `xml_result`
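
To make the XML Join settings above more concrete, here is a minimal Python 
sketch of what the join does conceptually: it takes the template (target XML), 
finds the merge point, and appends each row's fragment (source XML) under it. 
This is only an illustration, not how Hop implements it. The function name 
`xml_join`, the sample template and the fragments are all made up, and 
`xml.etree.ElementTree` only supports a limited XPath subset with paths relative 
to the root element, so the merge point is written as `items` rather than an 
absolute path.

```python
import xml.etree.ElementTree as ET


def xml_join(template: str, xpath: str, fragments: list[str]) -> str:
    """Append each XML fragment under the node that `xpath` selects in `template`.

    Hypothetical helper that mimics the idea of the XML Join transform.
    """
    root = ET.fromstring(template)
    # ElementTree paths are relative to the root element; "." means the root itself.
    target = root if xpath == "." else root.find(xpath)
    if target is None:
        raise ValueError(f"merge point {xpath!r} not found in template")
    for fragment in fragments:
        # Each fragment must be a well-formed XML element on its own.
        target.append(ET.fromstring(fragment))
    return ET.tostring(root, encoding="unicode")


# Assumed sample data: XML_TEMPLATE as read by Get variables,
# fragments as delivered by Get rows from result.
template = "<order><items/></order>"
fragments = ["<item>a</item>", "<item>b</item>"]

xml_result = xml_join(template, "items", fragments)
print(xml_result)
```

The template is parsed once and all fragments of the group end up under the same 
merge point, which is why the sub-pipeline has to be run once per group.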

The parent pipeline should process the data stream as follows:
- use a **Sort rows** transform to order the data by the grouping keys you need
- link one output of Sort rows to an **Add XML** transform to generate the XML 
fragment for each row
- link another output of Sort rows to a **Group by** transform, and specify 
only the grouping fields. Then link this to an **Add sequence** transform to 
give each group a distinct number (name the resulting field, e.g. `group_id`)
- link Add XML and Add sequence to a **Merge join**, using the same grouping 
keys as merging keys
- link this Merge join to the **Pipeline Executor** of the sub-pipeline above, 
and configure it as follows:
  - _Parameters_: pass the field `group_id` to `GROUP_ID`, and pass the 
`XML_TEMPLATE` (as a constant string, from a field set by an earlier transform, etc.)
  - _Row grouping_: remove any value in _Number of rows to send to pipeline_ 
and specify _Field to group rows on_ = `group_id`
- link the _result rows_ output of the Pipeline Executor and the output of Add 
sequence above to a second **Merge join**. This is needed to reconcile the 
original grouped data with the generated XML, because the XML Join in the 
sub-pipeline strips any field that is not in the Get variables output

Done... at the second Merge join's output you should have the grouping keys and 
the XML related to them!
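
For readers who find the data flow easier to follow in code, here is a rough 
Python sketch of the parent pipeline steps described above. It is a simulation 
under assumptions, not Hop code: the input rows, the `customer` grouping key, the 
`product` field, the template and the `run_sub_pipeline` stand-in are all 
hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

# Hypothetical input rows: "customer" is the grouping key, "product" the payload.
rows = [
    {"customer": "B", "product": "pear"},
    {"customer": "A", "product": "apple"},
    {"customer": "A", "product": "fig"},
]
XML_TEMPLATE = "<order><items/></order>"  # assumed template, merge point "items"


def run_sub_pipeline(template, fragments):
    """Stand-in for the sub-pipeline: join the fragments into the template."""
    root = ET.fromstring(template)
    merge_point = root.find("items")
    for fragment in fragments:
        merge_point.append(ET.fromstring(fragment))
    return ET.tostring(root, encoding="unicode")


# Sort rows: order the data by the grouping key.
rows.sort(key=itemgetter("customer"))

# Add XML: generate one XML fragment per row.
for row in rows:
    item = ET.Element("item")
    item.text = row["product"]
    row["xml_fragment"] = ET.tostring(item, encoding="unicode")

# Group by + Add sequence: one distinct number per grouping key.
group_ids = {
    key: number
    for number, (key, _) in enumerate(groupby(rows, key=itemgetter("customer")), start=1)
}

# First Merge join: attach group_id to every row via the grouping key.
for row in rows:
    row["group_id"] = group_ids[row["customer"]]

# Pipeline Executor grouped on group_id: one sub-pipeline run per group.
xml_by_group = {
    gid: run_sub_pipeline(XML_TEMPLATE, [r["xml_fragment"] for r in group])
    for gid, group in groupby(rows, key=itemgetter("group_id"))
}

# Second Merge join: reconcile the grouping keys with the generated XML.
result = [
    {"customer": key, "xml_result": xml_by_group[gid]}
    for key, gid in group_ids.items()
]
for line in result:
    print(line)
```

The last step mirrors the second Merge join: the sub-pipeline only returns the 
XML, so the grouping keys have to be re-attached through `group_id`.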

GitHub link: 
https://github.com/apache/hop/discussions/5395#discussioncomment-13410277
