Thank you Matt,
 
that looks like quite a mosaic :-)
Maybe I'd try to use Python for the first stages. But there's no Python action 
in Hop :-(
Will there be one some day?

Regards
 
 

Sent: Thursday, December 01, 2022 at 4:28 PM
From: "Matt Casters" <matt.cast...@neo4j.com>
To: users@hop.apache.org
Subject: Re: Any trick to read data from Excel file where no of columns is not 
known?

A long time ago I did this in 'another' tool.
 
IIRC this is what's involved:
1) Scan the Excel files and determine the sheets, number of columns, their 
names and data types
1a) Sheets: leave the sheet name blank in the list, simply set start column/row 
to 0/0, include sheet name as an additional column in the output.
1b) Columns: set a few hundred unnamed columns, all strings, read 1 row.  
The values are the names of the columns
1c) Data types: write to a CSV file and use the "File Metadata" transform to 
get the types
2) Inject this information into the Excel Input transform using ETL Metadata 
injection which also runs the pipeline.
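
The logic of steps 1b and 1c can be sketched outside Hop, in plain Python. This is only an illustration under stated assumptions, not a Hop API: the function names (trim_header, infer_type, discover_fields), the type labels and the date format are all hypothetical, and the rows are assumed to have already been read as lists of strings, one list per spreadsheet row.

```python
from datetime import datetime


def trim_header(header_row):
    """Step 1b: the first row holds the column names; drop trailing empty cells."""
    names = [(cell or "").strip() for cell in header_row]
    while names and not names[-1]:
        names.pop()
    return names


def infer_type(values):
    """Step 1c (simplified): first type that parses every non-empty value."""
    values = [v.strip() for v in values if v and v.strip()]
    if not values:
        return "String"
    parsers = (
        ("Integer", int),
        ("Number", float),
        # Assumption: dates are written as YYYY-MM-DD.
        ("Date", lambda v: datetime.strptime(v, "%Y-%m-%d")),
    )
    for type_name, parse in parsers:
        try:
            for v in values:
                parse(v)
            return type_name
        except ValueError:
            continue
    return "String"


def discover_fields(rows):
    """Return one {'name', 'type'} dict per column actually used in the sheet."""
    names = trim_header(rows[0])
    data = rows[1:]
    fields = []
    for i, name in enumerate(names):
        column = [row[i] if i < len(row) else "" for row in data]
        # Unnamed columns inside the used range get a placeholder name.
        fields.append({"name": name or f"column_{i + 1}", "type": infer_type(column)})
    return fields


# Hypothetical sample: five string columns were read, only three are in use.
sample_rows = [
    ["column_1", "column_2", "column_3", "", ""],
    ["1", "2.5", "2022-12-01", "", ""],
    ["2", "3", "2022-12-02", "", ""],
]
fields = discover_fields(sample_rows)
```

The resulting list of names and types is the kind of information that step 2 would hand to ETL Metadata Injection; how it is wired into the Excel Input transform is not shown here.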
 
Best of luck,
Matt
 
  

On Thu, Dec 1, 2022 at 3:12 PM <pod...@gmx.com> wrote:

Hello,

is there a way to read data from an Excel file when the number of columns is 
not known in advance?
I mean that sometimes the file looks like:

column_1; column_2

but other times like:
column_1; column_2; column_3; column_4

Normally we need to define the columns in the 'Fields' tab. Is it possible to 
avoid defining them in a fixed way?

Regards
 
  
 

 
 
 
 
 
