Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-06 Thread via GitHub


mattcasters commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4193221044

   We're always allowed to brainstorm.  What you describe is already being done 
in Hop by the Beam implementation, see the 
`HopPipelineMetaToBeamPipelineConverter`.  Granted, it's easier since Beam and 
Hop are alike but for sure it can be done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-06 Thread via GitHub


mhamedbenjmaa commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4193188343

   Are we allowed to brainstorm here? If so, I agree with @hansva: there is no 
need to switch to Python or use it. Java is more than enough, especially since 
all the connectors are already there. Just read the pipeline XML and generate 
SQL.
   
   Apache Calcite is already available and can take standard SQL and adapt it 
for any vendor.
   
   The only things left to do are:
   
   1) Validate that the pipeline is ELT ready (nothing unusual, no multiple 
sources, no CSV destination, no unsupported functions, etc.)
   2) Generate standard SQL
   3) Adapt it to the destination vendor (using Calcite or any other 
ready-to-use Java library)
   4) Push it to the destination
   
   We can start simple.
   
   Let's say we only support simple pipelines such as select from, join, copy, 
merge (union) and insert into, and add more step by step later.
   
   How about that?
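
As a rough illustration of step 1 above, here is a minimal Java sketch of an "ELT ready" pre-check. Everything in it is an assumption made for illustration: the `EltReadinessCheck` and `Transform` classes, the transform type names and the three rules are hypothetical and are not part of Hop's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical pre-check: is a pipeline simple enough to push down as SQL? */
final class EltReadinessCheck {

    /** Minimal stand-in for a pipeline transform (not a Hop class). */
    static final class Transform {
        final String type;
        final String connection; // null when the transform touches no database

        Transform(String type, String connection) {
            this.type = type;
            this.connection = connection;
        }
    }

    // Assumed type names for the "start simple" subset; not real Hop plugin IDs.
    static final Set<String> SUPPORTED =
            Set.of("TableInput", "Join", "Union", "Filter", "TableOutput");

    /** Returns the problems found; an empty list means the pipeline looks ELT ready. */
    static List<String> validate(List<Transform> pipeline) {
        List<String> problems = new ArrayList<>();
        Set<String> connections = new TreeSet<>();
        for (Transform t : pipeline) {
            if (!SUPPORTED.contains(t.type)) {
                problems.add("unsupported transform: " + t.type);
            }
            if (t.connection != null) {
                connections.add(t.connection);
            }
        }
        // "No multiple sources" is read here as: one database connection only.
        if (connections.size() > 1) {
            problems.add("multiple connections: " + connections);
        }
        // "No CSV destination" is read here as: the last transform writes to a table.
        if (pipeline.isEmpty()
                || !"TableOutput".equals(pipeline.get(pipeline.size() - 1).type)) {
            problems.add("destination is not a database table");
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Transform> pipeline = List.of(
                new Transform("TableInput", "dwh"),
                new Transform("Filter", null),
                new Transform("TableOutput", "dwh"));
        System.out.println(validate(pipeline));
    }
}
```

Returning a list of problems rather than a boolean lets a caller explain why a pipeline has to fall back to the regular engine.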





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-06 Thread via GitHub


mattcasters commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4193112085

   I wouldn't mind adding more support for `dbt`, though I'm not sure what that 
would look like.





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-06 Thread via GitHub


hansva commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4192769975

   That would allow you to construct a pipeline, but won't allow you to create 
a plugin. I think multi-language pipelines are not on our roadmap.





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-06 Thread via GitHub


mattcasters commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4192422004

   @CarlosJuncher03 Not yet.  PyHop is work in progress:  
https://github.com/mattcasters/hop/blob/8fec313419c65873c61f2d7c20f8ea3a043a34a3/docs/hop-user-manual/modules/ROOT/pages/hop-tools/hop-python/hop-python.adoc





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-06 Thread via GitHub


CarlosJuncher03 commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4192306553

   Currently, there isn't an SDK available for me to develop a Python plugin 
for Apache Hop, right? Only Java?





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-05 Thread via GitHub


mattcasters commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4190567431

   These are all great ideas. Perhaps someone will write the code for it.





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-05 Thread via GitHub


CarlosJuncher03 commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4189857988

   One idea would be to develop a plugin that reads the pipeline XML and 
generates the SQL for the transformation types. I don't know if it's the 
easiest way, but it would be an idea.
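
The first half of that idea, reading the pipeline XML, could look roughly like the Java sketch below. It only uses the JDK's DOM parser. The element names (`pipeline`, `transform`, `name`, `type`) describe a simplified layout assumed for this example, and `PipelineXmlReader` is a hypothetical helper, not a Hop class; a real plugin would more likely go through Hop's own metadata classes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that lists the transforms found in a pipeline XML string. */
final class PipelineXmlReader {

    /** Returns "name:type" for every transform element, in document order. */
    static List<String> transforms(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("transform");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element transform = (Element) nodes.item(i);
                result.add(text(transform, "name") + ":" + text(transform, "type"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse pipeline XML", e);
        }
    }

    /** Text of the first descendant element with the given tag, or "" if absent. */
    private static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Made-up input that follows the assumed layout.
        String xml = "<pipeline>"
                + "<transform><name>read T1</name><type>TableInput</type></transform>"
                + "<transform><name>write T2</name><type>TableOutput</type></transform>"
                + "</pipeline>";
        System.out.println(transforms(xml));
    }
}
```

Mapping each extracted type to a SQL fragment would be the second half of the plugin and is not shown here.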





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-03 Thread via GitHub


mhamedbenjmaa commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4186332043

   I'm just suggesting this, and I would love to participate as well.
   
   @hansva Actually, under the hood, most vendors that apply this technique do 
not generate SQL directly. They generate dbt code, which is more reliable and 
technology agnostic.
   
   Again, it can be very handy when we deal with cloud platforms and want to 
avoid traffic costs. DataStage, for instance, has been rated at the top by 
Gartner for years; they do not implement features without a reason.
   
   So if you like the idea, I'll be happy to discuss and contribute.





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-03 Thread via GitHub


mhamedbenjmaa commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4185141755

   Actually, this is a killer feature. It removes the need to think about dbt 
or tools like it. Here is how it is supposed to work:
   
   Let's assume a simple job: input T1 --> derivation A*B --> sort by C --> 
filter (remove nulls on D) --> destination T2
   
   You run the job in the default ETL mode: all the work happens in the engine 
and the result is then inserted into T2. So far so good.
   
   Now we want to add an ELT mode (because we are using Snowflake, want to 
avoid egress costs, or for any other reason). If we check the box "Run this job 
in ELT mode", Apache Hop will analyse the job components and, instead of 
working as ETL, generate this SQL:
   
   INSERT INTO T2 
   SELECT A, B, C, D, A*B AS derivation FROM T1 WHERE D IS NOT NULL ORDER BY C ASC
   
   It then pushes this SQL to the destination. Bingo, we are in ELT mode now: 
no engine, no data movement, nothing.
   
   Major ELT vendors are offering this feature now. Here is an example:
   
   
https://dataplatform.cloud.ibm.com/docs/content/dstage/dsnav/topics/elt-mode.html?context=cpdaas
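
To make the T1 to T2 example concrete, here is a minimal Java sketch that assembles the pushed-down statement from the pieces of that job. `EltSqlSketch` and its parameters are hypothetical names chosen for this example and do not exist in Hop. Identifiers are concatenated without quoting or escaping, which a real implementation could not get away with, and whether an `ORDER BY` inside `INSERT ... SELECT` has any lasting effect depends on the target database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical generator for the single-statement ELT example. */
final class EltSqlSketch {

    /**
     * Builds INSERT INTO target SELECT columns, derived FROM source
     * WHERE notNullColumn IS NOT NULL ORDER BY sortColumn ASC.
     */
    static String insertSelect(String source, String target, List<String> columns,
                               String derivedExpr, String derivedAlias,
                               String notNullColumn, String sortColumn) {
        List<String> select = new ArrayList<>(columns);
        select.add(derivedExpr + " AS " + derivedAlias); // the "derivation" transform
        return "INSERT INTO " + target
                + " SELECT " + String.join(", ", select)
                + " FROM " + source
                + " WHERE " + notNullColumn + " IS NOT NULL" // the "filter" transform
                + " ORDER BY " + sortColumn + " ASC";        // the "sort" transform
    }

    public static void main(String[] args) {
        System.out.println(insertSelect(
                "T1", "T2", List.of("A", "B", "C", "D"),
                "A * B", "derivation", "D", "C"));
    }
}
```

Each transform in the job contributes one clause, which is why only pipelines whose transforms all have a SQL equivalent can run this way.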
   






Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-03 Thread via GitHub


mattcasters commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4185335003

   Most of these databases are CPU bound these days and in general always have 
been in practice.  This is mainly because of database licensing per core.  I 
have seen ETL operations being faster in a database, but in general it's simply 
not true.  It's a fairytale told by the likes of Oracle to sell more expensive 
contracts. 





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-03 Thread via GitHub


hansva commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4185306408

   It would be another engine. I like the idea, but someone would have to write 
the engine. Another question is whether we could write all of that translation 
in ANSI SQL or whether it would need different engine types for Snowflake, 
PostgreSQL, ...
   
   So the main question is: is this something you are willing to work 
on/develop, or is this something for the idea box until a developer shows up 
who wants to do the job?





Re: [I] [Feature Request]: Build ETL/ELT alternative (hop)

2026-04-02 Thread via GitHub


mattcasters commented on issue #6907:
URL: https://github.com/apache/hop/issues/6907#issuecomment-4180795100

   I must be missing what exactly you're suggesting.  ELT in the context of Hop 
would be running a pipeline in Apache Spark, as an example.

