Hi Takeshi,
Thank you very much.
Regards,
Chanh
On Thu, Jun 8, 2017 at 11:05 PM, Takeshi Yamamuro wrote:
I filed a jira about this issue:
https://issues.apache.org/jira/browse/SPARK-21024
On Thu, Jun 8, 2017 at 1:27 AM, Chanh Le wrote:
Can you recommend one?
Thanks.
On Thu, Jun 8, 2017 at 2:47 PM Jörn Franke wrote:
You can change the CSV parser library
On 8. Jun 2017, at 08:35, Chanh Le wrote:
I did add mode -> DROPMALFORMED, but it still couldn't ignore the bad lines because
the error is raised from the CSV library that Spark is using.
On Thu, Jun 8, 2017 at 12:11 PM Jörn Franke wrote:
The CSV data source allows you to skip invalid lines - this should also include
lines that have more columns than maxColumns. Choose mode "DROPMALFORMED".
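To make the suggested behavior concrete: the sketch below is a stand-alone Python illustration (using the stdlib csv module, not Spark) of what a DROPMALFORMED-style policy amounts to — rows that do not fit the expected schema width are silently discarded instead of aborting the parse. The schema width and sample data are made up for illustration.

```python
import csv
import io

EXPECTED_COLUMNS = 3  # hypothetical schema width

raw = io.StringIO(
    "a,b,c\n"
    "1,2,3\n"
    "1,2,3,4,5\n"   # malformed: wider than the expected schema
    "4,5,6\n"
)

# Keep only rows that match the expected width; drop the rest silently.
rows = [r for r in csv.reader(raw) if len(r) == EXPECTED_COLUMNS]
```

After filtering, `rows` contains only the three well-formed records; the 5-column row is dropped without raising an error.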
On 8. Jun 2017, at 03:04, Chanh Le wrote:
Hi Takeshi, Jörn Franke,
The problem is that even if I increase maxColumns, some lines still have more
columns than the limit I set, and a large limit costs a lot of memory.
So I just want to skip any line that has more columns than the maxColumns I set.
Regards,
Chanh
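One way to get the "skip lines wider than maxColumns" behavior asked for here is to pre-filter the raw text before it ever reaches the CSV parser. The sketch below is plain Python with a made-up threshold and sample data, just to illustrate the filter; it assumes a simple comma delimiter with no quoted fields containing commas.

```python
# Hypothetical limit and sample lines, for illustration only.
MAX_COLUMNS = 4

lines = [
    "id,name,score",
    "1,alice,0.9",
    "2,bob,0.8,extra,extra,extra,extra",  # wider than MAX_COLUMNS
]

# Naive column count: delimiter occurrences + 1. Lines over the limit
# are dropped before any real CSV parsing happens.
kept = [ln for ln in lines if ln.count(",") + 1 <= MAX_COLUMNS]
```

In Spark, the same idea could be applied by reading the file as plain text, filtering the lines, and only then handing the survivors to the CSV parser (the exact API for that depends on the Spark version).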
On Thu, Jun 8, 2017 at 12:48 AM, Takeshi Yamamuro wrote:
Is it not enough to set `maxColumns` in CSV options?
https://github.com/apache/spark/blob/branch-2.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala#L116
// maropu
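For reference, a minimal sketch of how the options discussed in the thread could be passed to the reader. `maxColumns` and `mode` are genuine Spark CSV options (the linked CSVOptions source defines `maxColumns`); the values and the input path are hypothetical, and a live SparkSession is assumed for the commented-out line.

```python
# Options discussed in the thread; option names are real Spark CSV
# options, the values here are made up for illustration.
csv_options = {
    "header": "true",
    "maxColumns": "30000",     # raise the CSV parser's column limit
    "mode": "DROPMALFORMED",   # drop rows the parser rejects
}

# With a live SparkSession `spark`, this would be applied roughly as:
# df = spark.read.options(**csv_options).csv("/path/to/input.csv")
```

As Chanh notes later in the thread, a very large maxColumns costs memory, so raising it is only a partial fix.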
On Wed, Jun 7, 2017 at 9:45 AM, Jörn Franke wrote:
Spark CSV data source should be able
On 7. Jun 2017, at 17:50, Chanh Le wrote:
Hi everyone,
I am using Spark 2.1.1 to read csv files and convert to avro files.
One problem I am facing is that if one row of the csv file has more columns
than maxColumns (default is 20480), the parsing process stops with an error:
Internal state when error was thrown: line=1, column=3, record=0,