!RowOrdering.isOrderable(leftKeys) =>
How is !RowOrdering.isOrderable(leftKeys) possible in the second case? I
must be missing something...again :( Please help.
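For context while I keep digging, here's a tiny self-contained sketch of what an orderability check over join keys boils down to. The types below are made-up stand-ins, not Catalyst's actual DataType hierarchy; maps are the classic non-orderable case:

```scala
// Hypothetical stand-ins for Catalyst data types -- not Spark's real API.
sealed trait DataType
case object IntType extends DataType
case object StringType extends DataType
final case class ArrayType(element: DataType) extends DataType
final case class MapType(key: DataType, value: DataType) extends DataType

object OrderingSketch {
  // A type is orderable unless it is (or contains) a map,
  // since map values have no defined ordering.
  def isOrderable(dt: DataType): Boolean = dt match {
    case MapType(_, _) => false
    case ArrayType(e)  => isOrderable(e)
    case _             => true
  }

  // Join keys are orderable only if every key's type is orderable --
  // roughly the shape of the RowOrdering.isOrderable(leftKeys) guard.
  def isOrderable(keys: Seq[DataType]): Boolean =
    keys.forall(k => isOrderable(k))
}
```

So the second case would fire whenever any left key's type fails such a check (e.g. a map-typed key).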
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Hi,
With bucketing support enabled by default in 2.3, I think that the number
of buckets should be included in the metrics of FileSourceScanExec.
WDYT? Shall I report an enhancement in JIRA?
Pozdrawiam,
Jacek Laskowski
Hi,
s,sbt ./build/sbt,./build/sbt
In other words, don't execute "sbt ./build/sbt", but "./build/sbt" itself
(you don't even have to install sbt to build Spark, as the launcher is
included in the repo and the script uses it internally).
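Applied mechanically, the substitution above reads: wherever a command line says "sbt ./build/sbt", drop the leading "sbt". A plain sed illustration (the sample command line is made up):

```shell
# Drop the spurious leading "sbt" so the launcher script bundled with
# the Spark repo is executed directly.
echo 'sbt ./build/sbt package' | sed 's,sbt ./build/sbt,./build/sbt,'
# prints: ./build/sbt package
```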
Pozdrawiam,
Jacek Laskowski
/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L483
[2]
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala#L750-L754
Pozdrawiam,
Jacek Laskowski
Hi,
I've run across InterpretedProjection and InterpretedMutableProjection,
which seem to be of no use, esp. InterpretedMutableProjection.
What's their purpose in Spark SQL? Why aren't they marked as @deprecated?
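To make the question concrete, this is my (possibly wrong) mental model of an interpreted projection, sketched with plain-Scala stand-ins rather than Catalyst's Expression/InternalRow types: it evaluates every expression against the input row one by one, with no generated code involved, which would make it a natural fallback when codegen is unavailable:

```scala
object InterpretedProjectionSketch {
  // Stand-ins: a row is a sequence of values; an expression is eval(row).
  type Row = Seq[Any]
  type Expression = Row => Any

  // An interpreted projection simply evaluates each expression against
  // the input row, one at a time -- no Janino-compiled code anywhere.
  def project(exprs: Seq[Expression])(row: Row): Row =
    exprs.map(e => e(row))
}
```

For example, project(Seq(r => r(1), r => r(0)))(Seq("a", "b")) swaps the two columns.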
Pozdrawiam,
Jacek Laskowski
Hi Reynold,
That is, in general, a very good idea to get the community engaged (even if
most people would just listen / hide in the dark like myself). I know of no
other open source project, at the ASF or elsewhere, where such an initiative
was even tried. Kudos for the idea!
Pozdrawiam,
Jacek Laskowski
the plans.
I'm wondering if I should file a task in JIRA for this or just send a pull
request? I'd appreciate some guidance.
[1]
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala#L167
Pozdrawiam,
Jacek Laskowski
the trait)?
Pozdrawiam,
Jacek Laskowski
ble identifier is resolvable). That would help in understanding that
part of Spark SQL a little better (i.e. writing a unit test with logical
rules and such).
Should I file an issue in JIRA for this? Any suggestions on how to do it
the right way?
Pozdrawiam,
Jacek Laskowski
Pozdrawiam,
Jacek Laskowski
Hi,
http://spark.apache.org/developer-tools.html#nightly-builds reads:
> Spark nightly packages are available at:
> Latest master build:
https://people.apache.org/~pwendell/spark-nightly/spark-master-bin/latest
but the URL gives 404. Is this intended?
Pozdrawiam,
Jacek Laskowski
%93#L1393
Pozdrawiam,
Jacek Laskowski
Thanks for looking into it, Kazuaki!
Pozdrawiam,
Jacek Laskowski
Hi Michael,
-dev +user
What's the query? How do you "fool spark"?
Pozdrawiam,
Jacek Laskowski
/ResolvedDataSourceSuite.scala
Pozdrawiam,
Jacek Laskowski
rk/sql/catalyst/plans/logical/basicLogicalOperators.scala#L895
Pozdrawiam,
Jacek Laskowski
catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.default(SizeInBytesOnlyStatsPlanVisitor.scala:27)
// analyzed logical plan works fine
scala> names.queryExecution.analyzed.stats
res23: org.apache.spark.sql.catalyst.plans.logical.Statistics =
Statistics(sizeInBytes=48.0 B, hints=none)
Pozdrawiam,
Jacek
r/sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala#L66-L73
[2]
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala?utf8=%E2%9C%93#L126
Pozdrawiam,
Jacek Laskowski
fun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24.apply(RDD.scala:816)
...
Is this a bug or does it work as intended? Why?
[1]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala?utf8=%E2%9C%93#L386
Pozdrawiam,
Jacek Laskowski
Hi Sean,
What does "Not all the pieces are released yet" mean if you don't mind me
asking? 2.2.1 has already been announced, hasn't it? [1]
[1] http://spark.apache.org/news/spark-2-2-1-released.html
Pozdrawiam,
Jacek Laskowski
, but not http://spark.apache.org/docs/latest :(
Pozdrawiam,
Jacek Laskowski
On Thu, Dec 14
because --> "Disable generate codegen since it fails my workload." - I
wish he had included the workload to showcase the issue :(
Looks like there are a bunch of wise people already on it, so I'll just
listen...
Pozdrawiam,
Jacek Laskowski
in whole-stage codegen it can extend the CodegenSupport trait and enable
accessing GenericInternalRow by turning the supportCodegen flag off.
I understand how badly that may read, but without help from the Spark SQL
devs, that's all I can figure out myself. Any help appreciated.
Pozdrawiam,
Jacek Laskowski
/scala/org/apache/spark/sql/execution/GenerateExec.scala#L58-L64
[2]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala#L125
Pozdrawiam,
Jacek Laskowski
Hi Satyajit,
That's exactly what Dataset.rdd does -->
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala?utf8=%E2%9C%93#L2916-L2921
Pozdrawiam,
Jacek Laskowski
/sql/catalyst/plans/logical/basicLogicalOperators.scala?utf8=%E2%9C%93#L890
Pozdrawiam,
Jacek Laskowski
child
[error] ^
[error] 8 errors found
[error] Compile failed at Dec 8, 2017 5:58:10 PM [8.170s]
[INFO]
Pozdrawiam,
Jacek Laskowski
che/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L2092
Pozdrawiam,
Jacek Laskowski
Hi Sahm,
Unless I'm mistaken [1], org.apache.spark.mllib has been put on hold and is
considered @deprecated these days. That would explain why "so many things
[were] made private".
[1]
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/package.scala#L21
Pozdrawiam,
ScanExec does (and so does BroadcastExchangeExec,
but that's not a data source so may have different reasons).
[1]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala#L31-L32
Pozdrawiam,
Jacek Laskowski
xec. Could anyone explain it in more detail? I'd appreciate.
Pozdrawiam,
Jacek Laskowski
/localhost:4040/SQL/execution/?id=0 shows no metrics for
LocalTableScan. Is this intended?
Pozdrawiam,
Jacek Laskowski
/streaming/StreamingQueryManager.scala#L335
Pozdrawiam,
Jacek Laskowski
Hi,
I'm guessing it's a timing issue. Once you started the query, batch 0
either had no rows to save or had not started yet (it runs on a separate
thread), and so spark.sql ran once and saved nothing.
You should rather use a foreach writer to save the results to Hive.
Jacek
On 29 Sep 2017 11:36 am, "HanPan"
Hi,
Oh, yeah. Seen Tejas here and there in the commits. Well deserved.
Jacek
On 29 Sep 2017 9:58 pm, "Matei Zaharia" wrote:
Hi all,
The Spark PMC recently added Tejas Patil as a committer on the
project. Tejas has been contributing across several areas of Spark for
a
Hi,
Nice catch, Sean! Learnt this today. They did say you could learn a lot
with Spark! :)
Pozdrawiam,
Jacek Laskowski
Hi,
Please disregard my finding. It does not seem to be a bug, but just a bit
of "dead code", as "init" will never be displayed in the web UI (the
minimum batch id can only ever be 0), and so getBatchDescriptionString
could be a little "improved".
Sorry for the noise.
Pozdrawiam,
05b0ad1a504e0d6213cf9d331#diff-6532dd3b63bdab0364fbcf2303e290e4R294
Pozdrawiam,
Jacek Laskowski
the state for the
key?
Example's coming up.
Pozdrawiam,
Jacek Laskowski
close to a
test and that I could use?
Thanks for any help you may offer!
Pozdrawiam,
Jacek Laskowski
://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala#L206
Pozdrawiam,
Jacek Laskowski
(but it was at least 2 days ago) :(
I'm using the master at
https://github.com/apache/spark/commit/fba9cc8466dccdcd1f6f372ea7962e7ae9e09be1.
Pozdrawiam,
Jacek Laskowski
timestamp#773,value#774L]
Pozdrawiam,
Jacek Laskowski
ommit" : 22
},
"eventTime" : {
"avg" : "2017-08-11T07:04:23.782Z",
"max" : "2017-08-11T07:04:28.282Z",
"min" : "2017-08-11T07:04:19.282Z",
"watermark" : "2017-08-11T07:04:08.282Z"
},
[1]
h
Hi,
Congrats!! Looks like Sean is gonna be less busy these days ;-)
Jacek
On 7 Aug 2017 5:53 p.m., "Matei Zaharia" wrote:
> Hi everyone,
>
> The Spark PMC recently voted to add Hyukjin Kwon and Sameer Agarwal as
> committers. Join me in congratulating both of them and
. SUCCESS [01:41 min]
[INFO] Spark Project SQL .. FAILURE [02:14 min]
Is it only me, or do others suffer from it too?
Pozdrawiam,
Jacek Laskowski
Confirmed. Thanks a lot, Sean.
Pozdrawiam,
Jacek Laskowski
On Sun, Jul 16, 2017 at 3:02 PM, Sean Owen <so...@cloudera.com> wrote:
Hi,
Just noticed that the 2.2.0 label is under Unreleased Versions in JIRA.
Since it's out, I think only 2.2.1 and 2.3.0 are valid. Correct?
Pozdrawiam,
Jacek Laskowski
/apache/spark/sql/execution/SQLExecution.scala#L63
[2]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala#L265
Pozdrawiam,
Jacek Laskowski
Hi,
Currently WindowExec gives no metrics in the web UI's Details for Query page.
What do you think about adding the number of partitions and frames?
That could certainly be super useful, but I'm unsure if that's the kind
of metric Spark SQL shows in the details.
Pozdrawiam,
Jacek Laskowski
t looks so similar
to the others [3]
[3]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L2940-L2942
Pozdrawiam,
Jacek Laskowski
https://issues.apache.org/jira/browse/SPARK-20597
I'm going to send a PR soon.
Pozdrawiam,
Jacek Laskowski
On Mon, May 1, 2017 at 8:26 PM
/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L145
[2]
https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L163
Pozdrawiam,
Jacek Laskowski
in groupBy and orderBy,
but doesn't seem supported in GROUPING SETS.
What do you think about adding the features to Spark SQL?
Pozdrawiam,
Jacek Laskowski
!
p.s. Just a side note: since Unevaluable is an Expression, why not
extend Unevaluable directly? I can understand why "extends Expression
with Unevaluable" could be very valuable, but I wish to hear what the
main motivation behind it was. Thanks doubled!
Pozdrawiam,
Jacek Laskowski
hub.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L107
[3]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/ExperimentalMethods.scala
Pozdrawiam,
Jacek Laskowski
eyeballs, the fewer the mistakes. If we make very fine/minor releases
often, we should be able to attract more people who spend their time on
testing/verification, which eventually contributes to a higher quality
of Spark.
Pozdrawiam,
Jacek Laskowski
+1
More, smaller, more frequent releases (so major releases get even higher
quality).
Jacek
On 13 Mar 2017 8:07 p.m., "Holden Karau" wrote:
> Hi Spark Devs,
>
> Spark 2.1 has been out since end of December
>
/YarnSparkHadoopUtil.scala#L270
[2]
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L2516
Pozdrawiam,
Jacek Laskowski
Thanks Sean. You've again been very helpful to put the right tone to
the matters. I stand corrected and have no interest in GSoC anymore.
Pozdrawiam,
Jacek Laskowski
understanding of `spark.memory.offHeap.enabled` being `false` is that
it does not disable the off-heap memory used by Java NIO for buffers in
shuffling, RPC, etc., so the memory used is always (?) more than you
request for -Xmx using executor-memory.
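To make that concrete, a sketch of the knobs involved (illustrative values only; the overhead property name is the YARN-specific one of that era, so treat the exact names as assumptions on my part):

```
# spark-defaults.conf (illustrative)
spark.memory.offHeap.enabled        false  # turns off Tungsten's explicit off-heap mode only
spark.executor.memory               4g     # becomes the executor JVM's -Xmx
spark.yarn.executor.memoryOverhead  512    # headroom for NIO direct buffers, etc. (YARN)
```

In other words, even with offHeap.enabled=false, direct buffers for shuffle and RPC are allocated outside the 4g heap, which is what the overhead setting is meant to cover.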
Pozdrawiam,
Jacek Laskowski
Hi Sean,
Given that 3.0.0 is coming, removing the unused versions would be a
huge benefit from a maintenance point of view. I'd support removing
support for 2.5 and earlier.
Speaking of Hadoop support, is anyone considering 3.0.0 support? Can't
find any JIRA for this.
Pozdrawiam,
Jacek Laskowski
Hi,
Is this something Spark is considering? It would be nice to mark issues as
GSoC in JIRA and solicit feedback. What do you think?
Pozdrawiam,
Jacek Laskowski
Hi Nicholas,
Interesting. Just this past Monday I was introducing Spark and ran into
it, but thought it was my poor English skills :-) Thanks for spotting it!
(I also think that the entire welcome page begs for a facelift - it's
from pre-2.0 days)
Jacek
On 28 Jan 2017 8:18 p.m., "Nicholas
Hi Imran,
Ok, that makes sense for performance reasons. Thanks for bearing with
me and explaining that code with so much patience. Appreciated!
Pozdrawiam,
Jacek Laskowski
hand, since no one has considered it a small
duplication it could be perfectly fine (it did make the code a bit
less obvious to me).
Pozdrawiam,
Jacek Laskowski
/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L211
[2]
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L229
Pozdrawiam,
Jacek Laskowski
Wow! At long last. Congrats Burak and Holden!
p.s. I was a bit worried that the process of accepting new committers
was as hard as passing Sean's sanity checks for PRs, but given this,
it seems so much easier :D
Pozdrawiam,
Jacek Laskowski
rc/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala#L210
Pozdrawiam,
Jacek Laskowski
eploy/yarn/ApplicationMaster.scala#L434
[3]
https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L254
Pozdrawiam,
Jacek Laskowski
On Wed, Jan 18, 2017 at 8:57 AM, Jacek Laskowski <ja...@japila.pl> wrote:
> p.s. How to know when the deprecation was introduced? The last change
> is for executor blacklisting so git blame does not show what I want :(
> Any ideas?
Figured that out myself!
$ git log --topo-orde
rg/apache/spark/SparkConf.scala#L641
[2]
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rpc/RpcEnv.scala#L32
Pozdrawiam,
Jacek Laskowski
Hi Sean,
Can you elaborate on "it's actually used by Spark"? Where exactly?
I'd like to be corrected.
What about the scaladoc? Since the method's a public API, I think it
should be fixed, shouldn't it?
Pozdrawiam,
Jacek Laskowski
epending on it (unless we go through a
> deprecation process for it).
>
> Regards,
> Mridul
>
>
> On Sat, Jan 14, 2017 at 2:02 AM, Jacek Laskowski <ja...@japila.pl> wrote:
> > Hi,
> >
> > Just noticed that TaskContext#getPartitionId [1] is not used an
ala/org/apache/spark/TaskContext.scala#L41
[2]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ForeachSink.scala#L50
Pozdrawiam,
Jacek Laskowski
ion.
Pozdrawiam,
Jacek Laskowski
/apache/spark/blob/master/core/src/main/scala/org/apache/spark/MapOutputTracker.scala#L84
Pozdrawiam,
Jacek Laskowski
+1
What an excellent way to offload some of your chores! There's so much to
learn from you, Sean!
(Now that Sean seems to have a bit more time, I'm gonna send a few PRs
hoping he spares some time to find merit in them :))
Pozdrawiam,
Jacek Laskowski
a lot.
On to digging deeper...
Pozdrawiam,
Jacek Laskowski
On Tue, Jan 3, 2017 at 10:08 PM, Imran Rashid <iras...@cloudera.com> wrote
Thanks, Herman, for the explanation.
I silently assume that the other points were OK since you did not object.
Correct?
Pozdrawiam,
Jacek Laskowski
for sharing your notes! Gonna merge yours with mine! Thanks.
Pozdrawiam,
Jacek Laskowski
On Mon, Jan 2, 2017 at 6:30 PM, Shuai Lin <linshu
(and BlockManagerMaster on the
driver) to track the shuffle locations (MapStatuses)?
Is my understanding correct? What am I missing? (I'm exploring the shuffle
system currently and would appreciate comments a lot!) Thanks!
Pozdrawiam,
Jacek Laskowski
Hi Yan,
I was surprised the first time I noticed rxin stepped back and a new
release manager stepped in. Congrats on your first ANNOUNCE!
I can only expect even more great stuff coming into Spark from the dev
team after Reynold spared some time
Can't wait to read the changes...
> spark.range(5).groupByKey(_ % 5).count.rdd.getNumPartitions
res3: Int = 200
I'd appreciate any guidance to get the gist of this seemingly magic
number. Thanks!
Pozdrawiam,
Jacek Laskowski
/src/main/scala/org/apache/spark/shuffle/ShuffleManager.scala#L35
Pozdrawiam,
Jacek Laskowski
Thanks a LOT, Michael!
Pozdrawiam,
Jacek Laskowski
On Mon, Dec 26, 2016 at 10:04 PM, Michael Gummelt
<mgumm...@mesosphere.io>
Hi Michael,
That caught my attention...
Could you please elaborate on "elastically grow and shrink CPU usage"
and how it really works under the covers? It seems that CPU usage is
just a "label" for an executor on Mesos. Where's this in the code?
Pozdrawiam,
Jacek
(and hence Broadcast) in.
WDYT?
[1] https://issues.apache.org/jira/browse/SPARK-12588
[2]
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/BroadcastFactory.scala#L25-L30
Pozdrawiam,
Jacek Laskowski
it anyway to
hunt down the "issue")?
2. Defining an override for sameResult in Range (as LocalRelation and
other logical operators do)?
Somehow I feel Spark could do better. Please guide (and help me get
better at this low-level infra of Spark SQL). Thanks!
Pozdrawiam,
Jacek Laskowski
---
nge (0, 1, step=1, splits=Some(8))
== Physical Plan ==
*Project [id#26L, id#26L AS new#29L]
+- *Range (0, 1, step=1, splits=Some(8))
Pozdrawiam,
Jacek Laskowski
/object.scala#L32
[4]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L2498
Pozdrawiam,
Jacek Laskowski
/org/apache/spark/sql/Column.scala#L152
[2]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L60
Pozdrawiam,
Jacek Laskowski
Hi,
Just noticed the messages from the recent build of my pull request in Jenkins:
[info] Warning: Unknown ScalaCheck args provided: -oDF
I think we should fix it, right?
Pozdrawiam,
Jacek Laskowski
comments to learn Spark better. Thanks.
Pozdrawiam,
Jacek Laskowski
/spark/scheduler/DAGScheduler.scala#L1372
[2]
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1376
Pozdrawiam,
Jacek Laskowski
cutors.
Pozdrawiam,
Jacek Laskowski
that code does not get compiled unless you enable the profile explicitly.
I've learnt it's not part of the release, though.
Thanks for all the clarifications! I really appreciate your patience in
dealing with my questions! Thanks.
Pozdrawiam,
Jacek Laskowski
is that LeafExpression is to mark leaf expressions, so
children is assumed to be Nil.
Should children be final in LeafExpression? Why not? #curious
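What I'm asking, sketched with hypothetical stand-ins (not Catalyst's actual TreeNode/Expression classes):

```scala
// A hypothetical mini expression tree, just to illustrate the question.
sealed trait Expr { def children: Seq[Expr] }

// A leaf has no children by definition, so the override could arguably
// be final, preventing subclasses from redefining it.
trait LeafExpr extends Expr {
  final override def children: Seq[Expr] = Nil
}

final case class Literal(value: Any) extends LeafExpr
```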
Pozdrawiam,
Jacek Laskowski
+1
Ship it!
Pozdrawiam,
Jacek Laskowski
On Sun, Sep 25, 2016 at 12:08 AM, Reynold Xin <r...@databricks.com> wrote:
> Please vote on