Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Jean-Baptiste Onofré

Hi Lucas,

IMHO, it's not an IO, it's a filesystem that TextIO and others can support (like 
GFS or HDFS).

It's what we did in Camel: the ftp component is just an extension of the file 
component.

It means that we would be able to do:

pipeline.apply(TextIO.read().from("ftp://..."));
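
As an illustration of the filesystem-rather-than-IO idea, here is a minimal sketch in plain Java of dispatching on the URI scheme, so the same call can be pointed at a local path or an ftp:// location. Everything in it (SketchFileSystem, SchemeRegistry, the example.invalid host) is hypothetical and is not the Beam or Camel API; the handlers only return a description instead of opening a connection.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a pluggable filesystem; not a Beam or Camel type.
interface SketchFileSystem {
    String describe(URI location);
}

// Maps a URI scheme ("file", "ftp", ...) to the filesystem that handles it.
final class SchemeRegistry {
    private final Map<String, SketchFileSystem> byScheme = new HashMap<>();

    void register(String scheme, SketchFileSystem fs) {
        byScheme.put(scheme.toLowerCase(Locale.ROOT), fs);
    }

    // Resolves the handler from the scheme; specs without a scheme are
    // treated as local files (an assumption made for this sketch).
    String describe(String spec) {
        URI uri = URI.create(spec);
        String scheme = uri.getScheme();
        String key = (scheme == null) ? "file" : scheme.toLowerCase(Locale.ROOT);
        SketchFileSystem fs = byScheme.get(key);
        if (fs == null) {
            throw new IllegalArgumentException("No filesystem registered for scheme: " + key);
        }
        return fs.describe(uri);
    }

    public static void main(String[] args) {
        SchemeRegistry registry = new SchemeRegistry();
        registry.register("file", uri -> "local " + uri.getPath());
        registry.register("ftp", uri -> "remote " + uri.getPath());

        // The caller does not change when the location moves to FTP.
        System.out.println(registry.describe("/tmp/data.txt"));
        System.out.println(registry.describe("ftp://example.invalid/in/data.txt"));
    }
}
```

With this shape, supporting a new protocol means registering one more handler, while the readers and writers built on top stay untouched.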

Thoughts?

If you agree, I would be happy to work on this (with any help ;)).

Regards
JB

On 07/23/2017 07:39 AM, Lucas Arruda wrote:

Hi Beam folks,

I would like to suggest the creation of a Pipeline I/O to support FTP/SFTP
as both source and sink locations for data processing. I've done some
research and it looks like there isn't any kind of development ongoing for
this (at least not on Jira).

I'd like to know your thoughts and whether someone would like to help/support
this initiative. If someone has already started something, please let me
know ;)

Thank you,



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Tolsa, Camille
Hello,

I would definitely appreciate this feature.
If I can help somehow, let me know.

Camille.


-- 


This e-mail transmission (message and any attached files) may contain 
information that is proprietary, privileged and/or confidential to Veolia 
Environnement and/or its affiliates and is intended exclusively for the 
person(s) to whom it is addressed. If you are not the intended recipient, 
please notify the sender by return e-mail and delete all copies of this 
e-mail, including all attachments. Unless expressly authorized, any use, 
disclosure, publication, retransmission or dissemination of this e-mail 
and/or of its attachments is strictly prohibited. 




Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Reuven Lax
This would require writing data to local files in order to upload it to the
remote FTP, right?



Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Tolsa, Camille
That is not necessary with StringIO.




Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Eugene Kirpichov
What is StringIO?



Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Jean-Baptiste Onofré
I guess TextIO? ;)

Regards
JB



Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Eugene Kirpichov
I think Camille may have been referring to the Python standard library class
StringIO, which would mean collecting the output into a string and then, I
suppose, uploading that string to FTP. That could work (similar classes exist
in the Java library) but would limit us to files whose content fits in memory.
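
A rough Java sketch of that in-memory approach, assuming ByteArrayOutputStream as the Java counterpart of StringIO. The InMemoryUpload class and bufferAll method are made up for illustration, and the upload step itself is left out; the point is only that the whole file exists as one byte array before anything could be sent.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Beam: buffers a whole file in memory.
final class InMemoryUpload {

    // Appends every record, newline-terminated, to a single growing buffer.
    // Heap usage is proportional to the total output size, which is the
    // limitation discussed above.
    static byte[] bufferAll(List<String> records) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        for (String record : records) {
            byte[] line = (record + "\n").getBytes(StandardCharsets.UTF_8);
            buffer.write(line, 0, line.length);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] content = bufferAll(Arrays.asList("first", "second"));
        // A real sink would hand these bytes to an FTP client here.
        System.out.println(content.length + " bytes held in memory before upload");
    }
}
```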



Re: [S]FTP support as Pipeline I/O

2017-07-24 Thread Jean-Baptiste Onofré
In Camel, we have different modes: local file caching, or streaming when 
possible (it depends on the body in the Exchange).

So, I think we can do the same in Beam.
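
The two modes could look roughly like the following plain-Java sketch. It is not Camel's or Beam's code: TransferModes, Mode and transfer are hypothetical names, and an OutputStream stands in for the remote FTP connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of two transfer modes; names are illustrative only.
final class TransferModes {
    enum Mode { LOCAL_CACHE, STREAM }

    // Copies source to remote and returns the number of bytes sent.
    static long transfer(InputStream source, OutputStream remote, Mode mode) {
        try {
            if (mode == Mode.STREAM) {
                // Streaming: fixed-size chunks, no local disk involved.
                return copy(source, remote);
            }
            // Local caching: stage the full content in a temp file first,
            // then send the staged file.
            Path staged = Files.createTempFile("ftp-sketch-", ".tmp");
            try {
                Files.copy(source, staged, StandardCopyOption.REPLACE_EXISTING);
                return Files.copy(staged, remote);
            } finally {
                Files.deleteIfExists(staged);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(chunk)) != -1) {
            out.write(chunk, 0, read);
            total += read;
        }
        return total;
    }
}
```

Streaming keeps memory and disk use bounded by the chunk size, while the cached mode holds a complete local copy until the upload finishes and costs local disk space equal to the file size.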

Regards
JB

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com