And you can use metadata injection to figure out the lot. There are plenty of
Pentaho examples floating around. It would be great if someone could spend the
time to replicate them in Hop.

Diego
-------- Original message --------
From: Hans Van Akelyen <hans.van.akel...@gmail.com>
Date: 3/12/22 12:08 am (GMT+10:00)
To: pod...@gmx.com, users@hop.apache.org
Subject: Re: Any trick to read data from Excel file where no of columns is not known?

It's not part of our distribution, but there is a Python transform [1].

Cheers,
Hans

[1] https://github.com/m-a-hall/hop-cpython

On 2 December 2022 at 12:59:04, pod...@gmx.com (pod...@gmx.com) wrote:

Thank you Matt,

that looks like quite a mosaic :-)

Maybe I'd try to use Python in the first stages. But there's no Python action
in Hop :-(

Will there be one someday?

Regards

Sent: Thursday, December 01, 2022 at 4:28 PM
From: "Matt Casters" <matt.cast...@neo4j.com>
To: users@hop.apache.org
Subject: Re: Any trick to read data from Excel file where no of columns is not known?

A long time ago I did this in 'another' tool.

IIRC this is what's involved:

1) Scan the Excel files and determine the sheets, the number of columns, their
names and data types.

1a) Sheets: leave the sheet name blank in the list, set the start column/row
to 0/0, and include the sheet name as an additional column in the output.

1b) Columns: define a few hundred unnamed columns, all strings, and read only
one row. The values in that row are the names of the columns.

1c) Data types: write the data to a CSV file and use the "File Metadata"
transform to get the types.

2) Inject this information into the Excel Input transform using ETL Metadata
Injection, which also runs the pipeline.
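
To make steps 1b and 1c more concrete, here is a minimal Python sketch of the
same idea outside of Hop. It assumes the sheet has already been read as a wide
grid of strings (for example by the over-wide Excel Input described in 1b).
The function names and the type labels ("Integer", "Number", "String") are
illustrative assumptions, not a Hop API, and the type check is deliberately
crude. The resulting list is the kind of field metadata that step 2 would
inject.

```python
from typing import Dict, List


def detect_columns(header_row: List[str]) -> List[str]:
    """Return header names up to the last non-empty cell (step 1b)."""
    names = [(cell or "").strip() for cell in header_row]
    while names and names[-1] == "":
        names.pop()
    return names


def infer_type(values: List[str]) -> str:
    """Guess a type label from string values (rough stand-in for step 1c)."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return "String"
    # Try the narrowest type first; fall back to String if nothing fits.
    for label, caster in (("Integer", int), ("Number", float)):
        try:
            for v in non_empty:
                caster(v)
            return label
        except ValueError:
            continue
    return "String"


def build_field_metadata(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Combine names and types into a field list ready for injection."""
    if not rows:
        return []
    names = detect_columns(rows[0])
    fields = []
    for index, name in enumerate(names):
        column = [row[index] if index < len(row) else "" for row in rows[1:]]
        fields.append({
            # Hypothetical fallback name for a blank header cell.
            "name": name or f"column_{index + 1}",
            "type": infer_type(column),
        })
    return fields


# Example grid: the sheet was read with 5 string columns, only 3 are in use.
sample = [
    ["column_1", "column_2", "column_3", "", ""],
    ["1", "2.5", "abc", "", ""],
    ["2", "3", "", "", ""],
]
fields = build_field_metadata(sample)
for field in fields:
    print(field)
```

In a real setup the detection pipeline would write these name/type pairs to
rows that feed the metadata injection, rather than printing them.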

 

Best of luck,
Matt


On Thu, Dec 1, 2022 at 3:12 PM <pod...@gmx.com> wrote:

Hello,

do we have some way to read data from an Excel file where the number of
columns is unknown?

I mean sometimes the file can look like:

column_1; column_2

but other times like:

column_1; column_2; column_3; column_3

Normally we need to define them in the 'Fields' tab. Is it possible not to
define them in a fixed way?

Regards