I think you could also try saveAsHadoopFile with a custom output format
like
https://github.com/amutu/tdw/blob/master/qe/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/protobuf/mapred/ProtobufOutputFormat.java
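
For illustration, a minimal sketch of that approach might look like the following. It is not the linked ProtobufOutputFormat: RawBytesOutputFormat and SaveAsBinaryFile are made-up names, and it assumes the old org.apache.hadoop.mapred API, which is what saveAsHadoopFile expects. Treat it as a starting point rather than a tested implementation.

```scala
import java.io.DataOutputStream

import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.io.{BytesWritable, NullWritable}
import org.apache.hadoop.mapred.{FileOutputFormat, JobConf, RecordWriter, Reporter}
import org.apache.hadoop.util.Progressable
import org.apache.spark.rdd.RDD

// Hypothetical output format: writes each value's bytes as-is, with no
// delimiter, escaping, or length prefix between records.
class RawBytesOutputFormat extends FileOutputFormat[NullWritable, BytesWritable] {
  override def getRecordWriter(
      ignored: FileSystem,
      job: JobConf,
      name: String,
      progress: Progressable): RecordWriter[NullWritable, BytesWritable] = {
    // Resolve the per-task part file under the job's output directory.
    val file = FileOutputFormat.getTaskOutputPath(job, name)
    val out: DataOutputStream = file.getFileSystem(job).create(file, progress)

    new RecordWriter[NullWritable, BytesWritable] {
      // getBytes returns the backing array, so only the first getLength bytes are valid.
      override def write(key: NullWritable, value: BytesWritable): Unit =
        out.write(value.getBytes, 0, value.getLength)

      override def close(reporter: Reporter): Unit = out.close()
    }
  }
}

object SaveAsBinaryFile {
  // Hypothetical helper: wraps each record in a BytesWritable and saves the RDD
  // through the custom format; produces one part file per partition.
  def save(rdd: RDD[Array[Byte]], path: String): Unit =
    rdd
      .map(bytes => (NullWritable.get(), new BytesWritable(bytes)))
      .saveAsHadoopFile[RawBytesOutputFormat](path)
}
```

Calling SaveAsBinaryFile.save(rdd, outputDir) would then replace the saveAsTextFile call. Note that this sketch concatenates the records of a partition into one part file without any framing, so if the record boundaries matter (protobuf messages are not self-delimiting) you would probably want to write a length prefix before each record.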
On Thu, 16 Jan 2020 at 09:34, Duan,Bing wrote:
> Hi all:
>
> I read
The more complicated but cleaner approach would be to implement a custom
datasource.
From: "Duan,Bing" <duanb...@baidu.com>
Date: Thursday, January 16, 2020 at 12:35 AM
To: "dev@spark.apache.org" <dev@spark.apache.org>
Subject: How to implement a "saveAsBinaryFile" function?
the source
> val writer = FileSystem.get(null).create(new Path("s3://..."))
> bytes.foreach(b => writer.write(b))
> writer.close()
> })
>
>
>
> The more complicated but cleaner approach would be to implement a
> custom datasource.
>
>
>
> *Fro
From: "Duan,Bing"
Date: Thursday, January 16, 2020 at 12:35 AM
To: "dev@spark.apache.org"
Subject: How to implement a "saveAsBinaryFile" function?
Hi all:
I read binary data (protobuf format) from the filesystem into an
RDD[Array[Byte]] with the binaryFiles function, and it works fine.
Hi Bing,
Good question, and the answer is: it depends on your use case.
If you really just want to write raw bytes, then you could use a
.foreach in which you open an OutputStream and write them to some file. But this
is probably not what you want, and in practice not very handy since you
Hi all:
I read binary data (protobuf format) from the filesystem into an
RDD[Array[Byte]] with the binaryFiles function, and it works fine. But when I
save it back to the filesystem with saveAsTextFile, the quotation marks are
escaped like this:
"\"20192_1\"",1,24,0,2,"\"S66.000x001\"", which should be