Spark SQL question

2023-01-27 Thread Kohki Nishio
in SQL, but feel like it's strange behavior... Does anybody have a good explanation for it? Thanks -- Kohki Nishio

Re: GC issue - Ext Root Scanning

2021-11-16 Thread Kohki Nishio
e. > Of course, reducing memory allocation in your app if possible always helps. > > > On Mon, Nov 15, 2021 at 10:18 AM Kohki Nishio wrote: > >> it's a VM, but it has 16 cores and 32 processors. >> >> -Kohki >> >> On Mon, Nov 15, 2021 at 12:53

Re: GC issue - Ext Root Scanning

2021-11-15 Thread Kohki Nishio

GC issue - Ext Root Scanning

2021-11-14 Thread Kohki Nishio
ms] [Humongous Register: 0.7 ms] [Humongous Reclaim: 0.3 ms] [Free CSet: 0.7 ms] [Eden: 8096.0M(8096.0M)->0.0B(8096.0M) Survivors: 96.0M->96.0M Heap: 23.3G(160.0G)->15.4G(160.0G)] [Times: user=23.46 sys=1.03, real=5.72 secs] -- Kohki Nishio
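A rough way to read the summary fields of the log line above is to compare CPU time with wall-clock time and to compute how much heap the pause reclaimed. The sketch below is only an illustration under stated assumptions: the helper names (`parse_times`, `parse_heap_gb`) are hypothetical, the regular expressions only cover the exact field layout quoted in this snippet, and heap sizes are assumed to be reported in `G`. With the values from this log, the ratio comes out at roughly 4.3 CPU-seconds per wall-clock second and about 7.9G reclaimed; this does not by itself explain the Ext Root Scanning time.

```python
import re

# Summary portion of the GC log line quoted in the thread above.
GC_SUMMARY = (
    "[Eden: 8096.0M(8096.0M)->0.0B(8096.0M) Survivors: 96.0M->96.0M "
    "Heap: 23.3G(160.0G)->15.4G(160.0G)] "
    "[Times: user=23.46 sys=1.03, real=5.72 secs]"
)


def parse_times(line):
    """Return (user, sys, real) seconds from a '[Times: ...]' field (hypothetical helper)."""
    match = re.search(r"user=([\d.]+) sys=([\d.]+), real=([\d.]+) secs", line)
    if match is None:
        raise ValueError("no Times field found")
    user, sys_time, real = (float(value) for value in match.groups())
    return user, sys_time, real


def parse_heap_gb(line):
    """Return (before, after) heap occupancy in G; assumes both values use the G unit."""
    match = re.search(r"Heap: ([\d.]+)G\([\d.]+G\)->([\d.]+)G\([\d.]+G\)", line)
    if match is None:
        raise ValueError("no Heap field in G units found")
    before, after = (float(value) for value in match.groups())
    return before, after


if __name__ == "__main__":
    user, sys_time, real = parse_times(GC_SUMMARY)
    before, after = parse_heap_gb(GC_SUMMARY)
    # CPU seconds consumed per wall-clock second during the pause.
    print(f"cpu/real ratio: {(user + sys_time) / real:.2f}")
    print(f"heap reclaimed: {before - after:.1f}G")
```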

Re: Possibly a memory leak issue in Spark

2021-09-22 Thread Kohki Nishio
w much > metadata remains in the driver post task/stage/job completion. > > On Sep 22, 2021, at 12:42 PM, Kohki Nishio wrote: > > I believe I have enough information, raised this > > https://issues.apache.org/jira/browse/SPARK-36827 > > thanks > -Kohki > > > On

Re: Lock issue with SQLConf.getConf

2021-09-11 Thread Kohki Nishio
Awesome, thanks! On Sat, Sep 11, 2021 at 6:34 AM Sean Owen wrote: > Looks like this was improved in > https://issues.apache.org/jira/browse/SPARK-35701 for 3.2.0 > > On Fri, Sep 10, 2021 at 10:21 PM Kohki Nishio wrote: > >> Hello, >> I'm running spark in local mode

Lock issue with SQLConf.getConf

2021-09-10 Thread Kohki Nishio
t(Collections.java:2586) - waiting to lock <0x7fc901c7d9f8> (a java.util.Collections$SynchronizedMap) at org.apache.spark.sql.internal.SQLConf.getConf(SQLConf.scala:3750) at org.apache.spark.sql.internal.SQLConf.planChangeLogLevel(SQLConf.scala:3160) at org.apache.spark.sql.catalyst.rules.PlanChangeLogger.<init>(RuleExecutor.scala:49) --- -- Kohki Nishio

Re: JavaSerializerInstance is slow

2021-09-07 Thread Kohki Nishio
>> I think there would definitely be interest in having a reliable and >> efficient local mode in Spark but it's a pretty different use case than >> what Spark originally focused on. >> >> Antonin >> >> On 03/09/2021 05:56, Kohki Nishio wrote: >> > I

JavaSerializerInstance is slow

2021-09-02 Thread Kohki Nishio
I'm seeing many threads deserializing tasks. I understand that since lambdas are involved, we can't use Kryo for that. However, I'm running in local mode, so this serialization isn't really necessary, is it? Is there any trick I can apply to get rid of this thread contention? I'm

Re: Ordering pushdown for Spark Datasources

2021-04-06 Thread Kohki Nishio

Ordering pushdown for Spark Datasources

2021-04-04 Thread Kohki Nishio
activity for ordering pushdown. Thanks -- Kohki Nishio

DataSourceV2 with ordering pushdown

2020-12-22 Thread Kohki Nishio
? Is working with a physical plan the only way to achieve this? Thanks -- Kohki Nishio

Re: ClassLoader problem - java.io.InvalidClassException: scala.Option; local class incompatible

2017-02-20 Thread Kohki Nishio
I created a JIRA. I believe SBT is a valid use case, but it was resolved as Not a Problem: https://issues.apache.org/jira/browse/SPARK-19675 On Mon, Feb 20, 2017 at 10:36 PM, Kohki Nishio <tarop...@gmail.com> wrote: > Hello, I'm writing a Play Framework application which does Spark

ClassLoader problem - java.io.InvalidClassException: scala.Option; local class incompatible

2017-02-20 Thread Kohki Nishio
getting this. I believe ExecutorClassLoader needs to override the loadClass method as well. Can anyone comment on this? It's picking up the Option class from the system classloader. Thanks -- Kohki Nishio

Re: Parquet partitioning for unique identifier

2015-09-04 Thread Kohki Nishio
number of rows is mostly > irrelevant. > > Cheng > > > On 9/4/15 1:24 AM, Kohki Nishio wrote: > > let's say I have data like this > >ID | Some1 | Some2| Some3 | > A1 | kdsfajfsa | dsafsdafa | fdsfafa | > A2 | dfsfafasd | 23jfdsjkj | 98

Re: Parquet partitioning for unique identifier

2015-09-03 Thread Kohki Nishio
ou specify partitioning column while saving data.. >> On Sep 3, 2015 5:41 AM, "Kohki Nishio" <tarop...@gmail.com> wrote: >> >>> Hello experts, >>> >>> I have a huge json file (> 40G) and trying to use Parquet as a file >>> format. E

Parquet partitioning for unique identifier

2015-09-02 Thread Kohki Nishio
uld be ideal if I could provide a partitioner based on the unique identifier value, such as one that computes its hash value. One option would be to produce a hash value and add it as a separate column, but that doesn't sound right to me. Are there any other ways I can try? Regards, -- Kohki Nishio
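The option mentioned in the message above, deriving a hash from the unique identifier and storing it as a separate column, can be sketched in plain Python. This is only an illustration of the idea, not Spark's or Parquet's API: `bucket_for`, `add_bucket_column`, `NUM_BUCKETS`, and the sample rows are hypothetical, and the bucket count of 16 is an arbitrary assumption.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 16  # assumption: chosen arbitrarily for illustration


def bucket_for(identifier, num_buckets=NUM_BUCKETS):
    """Map a unique identifier to a stable bucket number in [0, num_buckets)."""
    # A digest is used instead of the built-in hash(), which is salted per process
    # for strings and would not give the same bucket across runs.
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def add_bucket_column(rows):
    """Return copies of the row dicts with an extra 'bucket' key derived from 'ID'."""
    return [{**row, "bucket": bucket_for(row["ID"])} for row in rows]


def group_by_bucket(rows):
    """Group rows by their 'bucket' value, as a partitioned writer would."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["bucket"]].append(row)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical sample rows; only the 'ID' key matters for bucketing.
    sample = [{"ID": "A1", "Some1": "x"}, {"ID": "A2", "Some1": "y"}]
    for bucket, members in sorted(group_by_bucket(add_bucket_column(sample)).items()):
        print(bucket, [row["ID"] for row in members])
```

Partitioning on the derived bucket rather than on the raw identifier keeps the number of partitions bounded by `NUM_BUCKETS`, however many distinct identifiers the data contains.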

FAILED_TO_UNCOMPRESS error from Snappy

2015-08-20 Thread Kohki Nishio
$.apply(Try.scala:161) at scala.util.Success.map(Try.scala:206) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:300) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:51) ... 33 more -- Kohki