I described the steps required to read a workbook automatically without
resorting to pure guesswork, no more and no less.  However it's
implemented, the language or API doesn't change any of those steps IMO.
That being said, the source is an Excel file, so some guessing is always
going to be involved.
Please be careful with data sources like that.

Good luck,
Matt
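
For illustration only, here is a minimal Python sketch of steps 1b and 1c
from the quoted message below. It assumes the sheet has already been read
as rows of strings with many more columns than needed. The function names,
the type labels and the single date format are hypothetical choices for
this sketch, not part of Hop or of any Excel library.

```python
from datetime import datetime


def _parse_date(text):
    # Assumption: dates arrive as ISO strings such as 2022-12-01.
    return datetime.strptime(text, "%Y-%m-%d")


def detect_header(first_row):
    """Step 1b: the first row holds the column names; drop trailing blanks."""
    names = [(cell or "").strip() for cell in first_row]
    while names and names[-1] == "":
        names.pop()
    return names


def infer_type(values):
    """Step 1c: guess one type for a column from its string samples."""
    samples = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not samples:
        return "String"
    # Try the strictest type first; fall back to String if nothing fits.
    for type_name, parser in (("Integer", int), ("Number", float), ("Date", _parse_date)):
        try:
            for sample in samples:
                parser(sample)
        except ValueError:
            continue
        return type_name
    return "String"


def describe_sheet(rows):
    """Return (column name, guessed type) pairs for one sheet."""
    header = detect_header(rows[0])
    result = []
    for index, name in enumerate(header):
        column = [row[index] if index < len(row) else "" for row in rows[1:]]
        result.append((name, infer_type(column)))
    return result
```

The resulting list of names and types is the kind of information that
step 2 would then inject into the Excel Input transform. The guessing
caveat still applies: a column that looks like integers in the sampled
rows may contain text further down.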

On Fri, Dec 2, 2022 at 12:59 PM <pod...@gmx.com> wrote:

>
> Thank you Matt,
>
> that looks quite mosaic :-)
> Maybe I'd try to use Python in the first stages. But there's no Python
> action in Hop :-(
> Will there be one someday?
>
> Regards
>
>
>
> Sent: Thursday, December 01, 2022 at 4:28 PM
> From: "Matt Casters" <matt.cast...@neo4j.com>
> To: users@hop.apache.org
> Subject: Re: Any trick to read data from Excel file where no of columns is
> not known?
>
> A long time ago I did this in 'another' tool.
>
> IIRC this is what's involved:
> 1) Scan the Excel files and determine the sheets, number of columns, their
> names and data types
> 1a) Sheets: leave the sheet name blank in the list, simply set start
> column/row to 0/0, include sheet name as an additional column in the output.
> 1b) Columns: set a few hundred unnamed columns, all strings, and read one
> row.  The values are the names of the columns
> 1c) Data types: write to a CSV file and use the "File Metadata" transform
> to get the types
> 2) Inject this information into the Excel Input transform using ETL
> Metadata injection which also runs the pipeline.
>
> Best of luck,
> Matt
>
>
>
> On Thu, Dec 1, 2022 at 3:12 PM <pod...@gmx.com> wrote:
>
> Hello,
>
> Is there a way to read data from an Excel file where the number of
> columns is unknown?
> I mean, sometimes the file looks like:
>
> column_1; column_2
>
> but other times like:
> column_1; column_2; column_3; column_3
>
> Normally we need to define them in the 'Fields' tab. Is it possible to
> avoid defining them in a fixed way?
>
> Regards


-- 
Neo4j Chief Solutions Architect
*✉   *matt.cast...@neo4j.com
