Re: [External] Re: [GraphFrames Spark Package]: Why is there not a distribution for Spark 3.3?

2024-03-17 Thread Ofir Manor
Just to add - the latest version is 0.8.3, it seems to support 3.3: "Support Spark 3.3 / Scala 2.12, Spark 3.4 / Scala 2.12 and Scala 2.13, Spark 3.5 / Scala 2.12 and Scala 2.13" (Releases · graphframes/graphframes, github.com) Ofir

Re: tuning - Spark data serialization for cache() ?

2017-08-07 Thread Ofir Manor
Thanks a lot for the quick pointer! So, is the advice I linked to in the official Spark 2.2 documentation misleading? You are saying that Spark 2.2 does not use Java serialization? And the tip to switch to Kryo is also outdated? Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-780

tuning - Spark data serialization for cache() ?

2017-08-07 Thread Ofir Manor
some other variations, like enabling Kryo per the tuning guide instructions, but didn't see any impact on the cached dataframe size (same tens of GBs in the UI). So any tips around that? Thanks. Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
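
For readers following this thread, here is a minimal sketch of what "enabling Kryo by the tuning guide instructions" usually amounts to: setting the `spark.serializer` property to `org.apache.spark.serializer.KryoSerializer`. The helper name `kryo_submit_args` and the application path are hypothetical and only illustrate passing the property via `spark-submit --conf`; this is not taken from the original message. It may also help explain the observation above: as far as I understand, `cache()` on a DataFrame uses Spark SQL's own in-memory columnar format rather than the configured serializer, so treat that as an assumption to verify rather than a confirmed cause.

```python
# Sketch only: builds spark-submit arguments that set spark.serializer to Kryo.
# The helper name and the application path are hypothetical.
KRYO_SERIALIZER = "org.apache.spark.serializer.KryoSerializer"


def kryo_submit_args(app_path, extra_conf=None):
    """Return a spark-submit argument list with spark.serializer set to Kryo."""
    conf = {"spark.serializer": KRYO_SERIALIZER}
    # Caller-supplied properties are merged in and may override the default.
    conf.update(extra_conf or {})
    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


if __name__ == "__main__":
    # Print the command line instead of running it, so no Spark install is needed.
    print(" ".join(kryo_submit_args("my_app.py")))
```

Comparing the Storage tab of the UI before and after such a change, for both a cached RDD and a cached DataFrame, is one way to check whether the setting affects the case at hand.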

Re: Does spark 2.1.0 structured streaming support jdbc sink?

2017-04-10 Thread Ofir Manor
Also check SPARK-19478 <https://issues.apache.org/jira/browse/SPARK-19478> - JDBC sink (seems to be waiting for a review) Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Mon, Apr 10, 2017 at 10:10 AM, Hemanth Gudela <hemanth.gud..

Re: Structured Streaming - Can I start using it?

2017-03-14 Thread Ofir Manor
e (changes to monitoring, troubleshooting etc), so I think you should know what you want to achieve here and ask / prototype if current release fits it. Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Mon, Mar 13, 2017 at 9:45 PM, Michael

Re: What do I loose if I run spark without using HDFS or Zookeeper?

2016-08-25 Thread Ofir Manor
). Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Thu, Aug 25, 2016 at 11:13 PM, Mich Talebzadeh <mich.talebza...@gmail.com > wrote: > Hi Kant, > > I trust the following would be of use. > > Big Data depends on Hadoop Ecosystem

Re: ORC v/s Parquet for Spark 2.0

2016-07-28 Thread Ofir Manor
to the details can explain. Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Thu, Jul 28, 2016 at 6:49 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote: > Like anything else your mileage varies. > > ORC with Vectorised query

Re: The Future Of DStream

2016-07-27 Thread Ofir Manor
For the 2.0 release, look for "Unsupported Operations" here: http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html Also, there are bigger gaps - like no Kafka support, no way to plug in user-defined sources or sinks, etc. Ofir Manor Co-Founder & CTO | Equalum

Re: [ANNOUNCE] Announcing Apache Spark 2.0.0

2016-07-27 Thread Ofir Manor
"old" Spark Streaming Programming Guide, as I think many users will look for them. I had a "deep link" to that page so I hadn't noticed until now that it is very hard to find. I'm referring to this page: http://spark.apache.org/docs/latest/structured-streaming-programming-guide.ht

Re: The Future Of DStream

2016-07-27 Thread Ofir Manor
someone will suggest starting a deprecation process that will eventually lead to its removal... As users, I guess we will need to apply judgement about when to switch to Structured Streaming - each of us has a different risk/value tradeoff, based on our specific situation... Ofir Manor Co-Founder

Re: ORC v/s Parquet for Spark 2.0

2016-07-26 Thread Ofir Manor
One additional point specific to Spark 2.0 - for the alpha Structured Streaming API (only), the file sink only supports Parquet format (I'm sure that limitation will be lifted in a future release before Structured Streaming is GA): "File sink - Stores the output to a directory. As of Spark

Re: Spark, Scala, and DNA sequencing

2016-07-24 Thread Ofir Manor
Hi James, BTW - if you are into analyzing DNA with Spark, you may also be interested in ADAM: https://github.com/bigdatagenomics/adam http://bdgenomics.org/ Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Fri, Jul 22, 2016 at 10:3

Re: Timeline for supporting basic operations like groupBy, joins etc on Streaming DataFrames

2016-06-07 Thread Ofir Manor
for my use case. Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Tue, Jun 7, 2016 at 12:36 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote: > 1. Not all types of joins are supported. Here is the list. > - Right outer join

Does decimal(6,-2) exists on purpose?

2016-05-26 Thread Ofir Manor
will become just nnnn. Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io

Re: Structured Streaming in Spark 2.0 and DStreams

2016-05-16 Thread Ofir Manor
://issues.apache.org/jira/browse/SPARK-13809 Eventually the pull request links to the design doc, which discusses the limits of updateStateByKey and mapWithState and how they will be handled... At a quick glance at the code, it seems to be used already in streaming aggregations. Just my two cents, Ofir Manor

Re: Structured Streaming in Spark 2.0 and DStreams

2016-05-15 Thread Ofir Manor
rocess - I don't know if that will land in 2.0 or only later. Hope that helps, Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Sun, May 15, 2016 at 11:58 PM, Benjamin Kim <bbuil...@gmail.com> wrote: > Hi Ofir, > > I just recently

Re: Structured Streaming in Spark 2.0 and DStreams

2016-05-15 Thread Ofir Manor
, the new event-time window processing SPARK-8360). The gap I see is mostly limited streaming sources / sinks migrated to the new (richer) API and semantics. Anyway, I'm pretty sure once 2.0 gets to RC, the documentation and examples will align with the current offering... Ofir Manor Co-Founder &