On Thursday 03 March 2016 12:47 AM, Benjamin Kim wrote:
Has anyone opened an SFTP connection to read a remote GZIP CSV file? Currently I 
download the file locally first, using the SFTP client in the spark-sftp package. Then 
I load the file into a dataframe using the spark-csv package, which automatically 
decompresses it. I just want to remove the "downloading file to local" 
step and have the remote file decompressed, read, and loaded directly. Can someone give 
me any hints?

One easy way on Linux, of course, is to use sshfs (https://github.com/libfuse/sshfs) and mount the remote directory locally. Since this uses FUSE, it works fine with normal user privileges.
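
As a rough sketch of what that looks like in practice: once the directory is mounted, the remote file is just an ordinary path, so it can be decompressed while it is being read, with no explicit download step. The mount point /mnt/remote, the file name data.csv.gz, and the helper read_gzip_csv below are made up for illustration, and the snippet uses plain Python rather than Spark to keep it self-contained.

```python
import csv
import gzip


def read_gzip_csv(path):
    """Yield rows from a gzip-compressed CSV file, decompressing while reading."""
    # "rt" opens the gzip stream in text mode; newline="" is what the csv module expects.
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            yield row


# Hypothetical usage, assuming the remote directory was mounted beforehand with
# something like:  sshfs user@host:/remote/dir /mnt/remote
#
# for row in read_gzip_csv("/mnt/remote/data.csv.gz"):
#     print(row)
```

The same mounted path should also be usable from spark-csv as a file:// path, with the caveat that on a cluster the mount would presumably have to exist on every node that reads the file.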

Thanks,
Ben

Thanks

--
Sumedh Wale
SnappyData (http://www.snappydata.io)


