doopConfWithOptions(relation.options))
> )
>
> import scala.collection.JavaConverters._
>
> val rows = readFile(pFile).flatMap(_ match {
>   case r: InternalRow => Seq(r)
>
>   // This doesn't work. vector mode is doing something screwy
>   case b: ColumnarBatch => b.rowIterator().asScala
> }).toList
>
> println(rows)
> // List([0,1,5b,24,66647351])
> // ?? this is wrong I think
>
>
>
> Has anyone attempted something similar?
>
>
>
> Cheers, Andrew
>
>
>
--
Ryan Blue
Software Engineer
Netflix
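A note on the output above: the bracketed hex values are consistent with UnsafeRow's toString, which prints raw field bytes rather than decoded columns, so the list contents may be fine even though they look wrong. As a minimal workaround sketch for the ColumnarBatch case, assuming the vectorized Parquet reader is what produces the batches (spark here is an active SparkSession; the config key is a standard Spark SQL option):

    // Force the Parquet source to hand back InternalRow instead of
    // ColumnarBatch by disabling vectorized reads before calling readFile.
    spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false")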
get(0, DataTypes.DateType));
>
> }
>
> It prints an integer as output:
>
> MyDataWriter.write: 17039
>
>
> Is this a bug, or am I doing something wrong?
>
> Thanks,
> Shubham
>
--
Ryan Blue
Software Engineer
Netflix
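The integer is expected at this layer: Spark's internal representation of DateType is an Int counting days since the Unix epoch (1970-01-01), so a DataSourceV2 DataWriter sees 17039 rather than a date object. A minimal decoding sketch in Scala, assuming Java 8+ java.time (the variable names are illustrative):

    import java.time.LocalDate

    // Spark stores DateType internally as days since 1970-01-01 (epoch day).
    val days = 17039
    val date = LocalDate.ofEpochDay(days.toLong)  // 2016-08-26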
elson, Assaf
>>> wrote:
>>>
>>> Could you add a fuller code example? I tried to reproduce it in my
>>> environment and I am getting just one instance of the reader…
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Assaf
>
or shouldn't
> come. Let me know if this understanding is correct.
>
> On Tue, May 1, 2018 at 9:37 PM, Ryan Blue <rb...@netflix.com> wrote:
>
>> This is usually caused by skew. Sometimes you can work around it by
>> increasing the number of partitions like you tri
org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:419)
> at
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:349)
>
>
--
Ryan Blue
Software Engineer
Netflix
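Where adding partitions alone does not help, a common alternative (not from this thread) is salting the skewed key so one hot value spreads across many reducers. A minimal sketch, with a hypothetical DataFrame df and column name key:

    import org.apache.spark.sql.functions._

    // Spread a hot join key across 10 buckets by appending a random salt;
    // the other side of the join must be expanded with all 10 salt values.
    val salted = df.withColumn(
      "salted_key",
      concat(col("key").cast("string"), lit("_"),
        (rand() * 10).cast("int").cast("string")))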
.
>> memoryOverhead.
>>
>> Driver memory=4g, executor mem=12g, num-executors=8, executor core=8
>>
>> Do you think the settings below can help me overcome the above issue:
>>
>> spark.default.parallelism=1000
>> spark.sql.shuffle.partitions=1000
>>
>> Because the default maximum number of partitions is 1000.
>>
>>
>>
>
--
Ryan Blue
Software Engineer
Netflix
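For what it's worth, the two settings behave differently: spark.sql.shuffle.partitions governs DataFrame/SQL shuffles and can be changed on a live session, while spark.default.parallelism only affects RDD operations and is read when the SparkContext starts, so it belongs in SparkConf or --conf. A minimal sketch, assuming a SparkSession named spark:

    // Takes effect for subsequent DataFrame/SQL shuffles on this session.
    spark.conf.set("spark.sql.shuffle.partitions", "1000")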
for a stage. In that version, you probably want to set
spark.blacklist.task.maxTaskAttemptsPerExecutor. See the settings docs
<http://spark.apache.org/docs/latest/configuration.html> and search for
“blacklist” to see all the options.
rb
On Mon, Apr 24, 2017 at 9:41 AM, Ryan Blue <rb...@netflix.c
>
>
> Regards
> Sumit Chawla
>
>
--
Ryan Blue
Software Engineer
Netflix
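As a concrete starting point, the blacklist settings are scheduler options, so they need to be in SparkConf (or passed via --conf) before the context starts; a minimal sketch with illustrative values:

    import org.apache.spark.SparkConf

    // Enable task blacklisting and cap attempts per executor
    // (available from Spark 2.1; tune the value to your failure pattern).
    val conf = new SparkConf()
      .set("spark.blacklist.enabled", "true")
      .set("spark.blacklist.task.maxTaskAttemptsPerExecutor", "1")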
progress"
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOfRange(Arrays.java:3664)
>     at java.lang.String.<init>(String.java:207)
>     at java.lang.StringBuilder.toString(StringBuilder.java:407)
>     at scala.collection.mutable.StringBuilder.toString(StringBuilder.scala:430)
>     at org.apache.spark.ui.ConsoleProgressBar.show(ConsoleProgressBar.scala:101)
>     at org.apache.spark.ui.ConsoleProgressBar.org$apache$spark$ui$ConsoleProgressBar$$refresh(ConsoleProgressBar.scala:71)
>     at org.apache.spark.ui.ConsoleProgressBar$$anon$1.run(ConsoleProgressBar.scala:55)
>     at java.util.TimerThread.mainLoop(Timer.java:555)
>     at java.util.TimerThread.run(Timer.java:505)
>
>
--
Ryan Blue
Software Engineer
Netflix
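Since every frame of the trace sits under ConsoleProgressBar, one low-risk mitigation sketch is to turn the console progress bar off entirely (a standard Spark setting; this removes the StringBuilder churn shown above, though the heap may simply be undersized for the job):

    import org.apache.spark.SparkConf

    // Disable the console progress bar; must be set before the
    // SparkContext is created.
    val conf = new SparkConf().set("spark.ui.showConsoleProgress", "false")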
astore, can you tell me which
> version is more compatible with Spark 2.0.2 ?
>
> Thanks
>
--
Ryan Blue
Software Engineer
Netflix
know the dictionary of words
> >> if there is no schema provided by the user? Where/how do I specify my
> >> schema / config for the Parquet format?
> >>
> >> I could not find the Apache Parquet mailing list on the official site.
> >> It would be great if anyone could share it as well.
> >>
> >> Regards
> >> Ashok
> >>
> >
> >
>
--
Ryan Blue
Software Engineer
Netflix
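On the schema question: Parquet files are self-describing, so readers normally take the schema from the file footer; in Spark an explicit read schema can still be supplied to override inference. A minimal sketch with a hypothetical one-column schema and path (assuming a SparkSession named spark):

    import org.apache.spark.sql.types._

    // Hypothetical schema; .schema(...) tells Spark to use it instead of
    // inferring from the Parquet footer.
    val schema = StructType(Seq(StructField("word", StringType)))
    val df = spark.read.schema(schema).parquet("/path/to/data")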