Sumedh,
How would this work? The only server that we have is the Oozie server, which
has no resources to run anything except Oozie, and we have no sudo permissions.
If we run the mount command using the shell action, which can run on any node
of the cluster via YARN, then the Spark job will not be
On Thursday 03 March 2016 12:47 AM, Benjamin Kim wrote:
I wonder if anyone has opened an SFTP connection to open a remote GZIP CSV file? I am able
to download the file first locally using the SFTP Client in the spark-sftp package. Then,
I load the file into a dataframe using the spark-csv
Apache Commons VFS is a Java library that will let you access files on an SFTP
server, with no local file handling involved:
https://commons.apache.org/proper/commons-vfs/filesystems.html
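
As a rough sketch of the "no local file handling" idea (this is not taken from
the Commons VFS documentation): once any SFTP client hands you a readable
binary stream, the gzip and CSV handling can be layered on top of it without
writing a local copy. The example below uses only the Python standard library.
The helper name iter_gzip_csv_rows is hypothetical, and an in-memory buffer
stands in for the remote file, so the SFTP connection itself is assumed and
not shown.

```python
import csv
import gzip
import io


def iter_gzip_csv_rows(remote_stream, encoding="utf-8"):
    """Yield CSV rows from a gzip-compressed binary stream, without touching disk.

    remote_stream is assumed to be any readable binary file-like object,
    for example one returned by an SFTP client (not shown here).
    """
    # GzipFile decompresses lazily as bytes are pulled from remote_stream.
    with gzip.GzipFile(fileobj=remote_stream, mode="rb") as gz:
        # Decode bytes to text; newline="" leaves line-ending handling to csv.
        text = io.TextIOWrapper(gz, encoding=encoding, newline="")
        yield from csv.reader(text)


# Stand-in for the remote file: a gzip-compressed CSV held in memory.
payload = gzip.compress(b"id,name\n1,alice\n2,bob\n")
rows = list(iter_gzip_csv_rows(io.BytesIO(payload)))
```

Note that this reads the file in a single process. Getting the rows into a
Spark dataframe is a separate step that this sketch does not cover.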
Hope this helps,
Ewan
From a quick look through the README and code for spark-sftp, it seems that
this connector works by downloading the file locally on the driver program,
and this is not configurable, so you would probably need to find a different
connector (and you probably shouldn't use spark-sftp for
I wonder if anyone has opened an SFTP connection to open a remote GZIP CSV file?
I am able to download the file first locally using the SFTP Client in the
spark-sftp package. Then, I load the file into a dataframe using the spark-csv
package, which automatically decompresses the file. I just