...know if Spark is headed in my direction.
You are implying Spark could be.
So tell me about the USE CASES and I'll do the rest.
On Tuesday, 14 April 2020 yeikel valdes wrote:
It depends on your use case. What are you trying to solve?
On Tue, 14 Apr 2020 15:36:50 -0400 janethor...@aol.com.INVALID wrote
Hi,
I consider myself to be quite good at software development, especially using
frameworks.
I like to get my hands dirty. I have spent the last few months...
When I use .limit(), the number of partitions for the returned DataFrame is 1,
which normally fails most jobs.
val df = spark.sql("select * from table limit n")
df.write.parquet("/path/to/output") // parquet() needs an output path; this one is made up
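A workaround sketch, assuming the goal is simply to restore parallelism after
the limit (the partition count and path below are made up):

    val df = spark.sql("select * from table limit n")
      .repartition(10) // spread the single post-limit partition back out
    df.write.parquet("/path/to/output")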
Thanks!
Thanks for your input, Soma, but I am actually looking to understand the
differences, and not only the performance.
On Sun, 05 Apr 2020 02:21:07 -0400 somplastic...@gmail.com wrote
If you want to measure optimisation in terms of time taken, then here is an
idea :)
public
Zeppelin is not an IDE but a notebook. It is helpful for experimenting, but it
is missing a lot of the features that we expect from an IDE.
Thanks for sharing though.
On Tue, 07 Apr 2020 04:45:33 -0400 zahidr1...@gmail.com wrote
When I first logged on I asked if there was a suitable
I am currently using a third-party library (Lucene) with Spark that is not
serializable. For that reason, it generates the following exception:
Job aborted due to stage failure: Task 144.0 in stage 25.0 (TID 2122) had a not
serializable result: org.apache.lucene.facet.FacetsConfig Serializa...
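A common pattern, sketched under the assumption that the FacetsConfig is only
needed on the executors (everything except FacetsConfig is hypothetical):
build the non-serializable object inside mapPartitions and return only
serializable data, so it is never part of a task result.

    import org.apache.lucene.facet.FacetsConfig

    val results = df.rdd.mapPartitions { rows =>
      // constructed locally on each executor, never shipped or returned
      val config = new FacetsConfig()
      rows.map { row =>
        // ... use config here, emit only serializable values ...
        row.toString
      }
    }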
Can you please explain what you mean by that? How do you use a UDF to replace
a join? Thanks
On Mon, 24 Feb 2020 22:06:40 -0500 jianneng...@workday.com wrote
Thanks Genie. Unfortunately, the joins I'm doing in this case are large, so a
UDF likely won't work.
Jianneng
From: Liu G
From what I understand, the session is a singleton, so even if you think you
are creating new instances you are just reusing it.
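A quick sketch that shows this (assuming a running Spark application or shell):

    import org.apache.spark.sql.SparkSession

    val a = SparkSession.builder().getOrCreate()
    val b = SparkSession.builder().appName("ignored").getOrCreate()

    // getOrCreate() hands back the existing session, so both are the same object
    println(a eq b) // true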
On Wed, 29 Jan 2020 02:24:05 -1100 icbm0...@gmail.com wrote
Dear all
I already have a Python function which is used to query data from HBase and
HDFS...
I am also interested. Many of the docs/books that I've seen are practical
examples of usage rather than the deep internals of Spark.
On Wed, 18 Sep 2019 21:12:12 -1100 vipul.s.p...@gmail.com wrote
Yes,
I realize what you were looking for; I am also looking for the same docs.
Haven
Isn't match_recognize just a filter?
df.filter(predicate)?
On Sat, 25 May 2019 12:55:47 -0700 kanth...@gmail.com wrote
Hi All,
Does Spark SQL have match_recognize? I am not sure why CEP seems to be neglected.
I believe it is one of the most useful concepts in financial applications.
What about a simple call to nanoTime?
long startTime = System.nanoTime();
// Spark work here
long endTime = System.nanoTime();
long duration = endTime - startTime; // elapsed time in nanoseconds
System.out.println(duration);
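If a SparkSession is at hand, its built-in time helper does the same thing
with less ceremony (this assumes Spark 2.1+, where SparkSession.time was added):

    // prints "Time taken: ... ms" and returns the block's result
    val result = spark.time {
      spark.sql("select count(*) from table").collect()
    }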
count() recomputes the DataFrame, so it makes sense that it takes longer for you.
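A sketch of the usual mitigation, assuming the DataFrame is reused and fits in
the available memory:

    val df = spark.sql("select * from table")
    df.cache()  // mark for caching; populated on the first action
    df.count()  // computes the DataFrame and materializes the cache
    df.count()  // answered from the cache, no recomputation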
On Tue, 02 Apr 2019 07:06:30 -0700 kol
If you need to reduce the number of partitions you could also try df.coalesce(n).
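A minimal sketch of the two options (the 5 is arbitrary):

    val fewer = df.coalesce(5)    // narrows partitions without a shuffle
    val even  = df.repartition(5) // full shuffle, but evenly sized partitions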
On Thu, 04 Apr 2019 06:52:26 -0700 jasonnerot...@gmail.com wrote
Have you tried something like this?
spark.conf.set("spark.sql.shuffle.partitions", "5")
On Wed, Apr 3, 2019 at 8:37 PM Arthur Li wrote:
H
Not according to the Parquet dev group:
https://groups.google.com/forum/m/#!topic/parquet-dev/jj7TWPIUlYI
On Mon, 07 Jan 2019 05:11:51 -0800 gourav.sengu...@gmail.com wrote
Hi,
Is there any limit to the number of columns that we can have in Parquet file
format?
Thanks and Regards,
Gour
Ideally, we would like to copy, paste, and try it on our end. A screenshot is
not enough.
If you have private information, just remove it and create a minimal example
we can use to replicate the issue.
I'd say something similar to this:
https://stackoverflow.com/help/mcve
On Mon, 07 Jan 2019 04:15:16 -080
On 07 January 2019 12:15, yeikel valdes wrote
Can you call this service with regular code (no Spark)?
On Mon, 07 Jan 2019 02:42:48 -0800 shashikantbang...@discover.com wrote
Hi team,
Please help, we are kind of blocked here.
Cheers,
Shashi
Forwarded Message
From: em...@yeikel.com
To: kfehl...@gmail.com
Date: Mon, 07 Jan 2019 04:11:22 -0800
Subject: Re: Can an UDF return a custom class other than case class?
In this case I am just curious because I'd like to know if it is possible.
At the same time...
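For reference, the variant that is known to work returns a case class, since
Spark derives encoders for Product types; all names here are made up for
illustration:

    import org.apache.spark.sql.functions.udf
    import spark.implicits._

    case class Point(x: Double, y: Double)

    // hypothetical UDF producing a struct column from two numeric columns
    val makePoint = udf((x: Double, y: Double) => Point(x, y))
    val withPoint = df.withColumn("point", makePoint($"x", $"y"))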
Please share a minimal amount of code so we can try to reproduce the issue...
On Mon, 07 Jan 2019 00:46:42 -0800 fyyleej...@163.com wrote
Hi all,
In my experiment program, I used Spark GraphX.
When running in IDEA on Windows, the result is right,
but when running on the Linux distributed clus...