Nobody has suggested removing the template.
On Tue, Mar 15, 2016 at 3:59 PM, Joseph Bradley wrote:
> +1 for keeping the template
>
> I figure any template will require conscientiousness & enforcement.
>
> On Sat, Mar 12, 2016 at 1:30 AM, Sean Owen wrote:
>>
>> The template is a great thing as it gets instructions even more right
>> in front of people.
+1 for keeping the template
I figure any template will require conscientiousness & enforcement.
On Sat, Mar 12, 2016 at 1:30 AM, Sean Owen wrote:
> The template is a great thing as it gets instructions even more right
> in front of people.
>
> Another idea is to just write a checklist of items,
Yeah, we are going to tighten a lot of classes' visibility. A lot of APIs
were made experimental, developer, or public for no good reason in the past.
Many of them (not Logging in this case) are tied to the internal
implementation of Spark at a specific time, and no longer make sense given
the project'
Oh, I just noticed the big warning in Spark 1.x Logging:
* NOTE: DO NOT USE this class outside of Spark. It is intended as an
internal utility.
* This will likely be changed or removed in future releases.
On Tue, Mar 15, 2016 at 3:29 PM, Koert Kuipers wrote:
> Makes sense.
>
> Note that Logging was not private[spark] in 1.x, which is why I used it.
Makes sense.
Note that Logging was not private[spark] in 1.x, which is why I used it.
On Tue, Mar 15, 2016 at 12:55 PM, Marcelo Vanzin
wrote:
> Logging is a "private[spark]" class so binary compatibility is not
> important at all, because code outside of Spark isn't supposed to use
> it. Mixing Spark library versions is also not recommended, not just
> because of this reason.
The data in the title column varies, so to correct the data in the column
one has to find out what the correct data is and then replace it.
Finding the correct data could be tedious, but if some mechanism is in
place which can help to group the partially matched data, then it might
help to do the furt
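One possible shape for such a grouping mechanism, sketched in plain Scala
(the normalize rules below are illustrative assumptions, not something
proposed in this thread):

// Group partially matched titles under a normalized key so a human
// (or a later pipeline stage) can pick the correct value per group.
def normalize(title: String): String = {
  // Illustrative rule: keep letters only, collapse doc/doctor variants.
  val letters = title.toLowerCase.replaceAll("[^a-z]", "")
  if (letters.startsWith("doc")) "doctor" else letters
}

val titles = Seq("D.O.C", "doctor_123", "doc_{dep,areacode}", "nurse")
val grouped = titles.groupBy(normalize)
// Map(doctor -> List(D.O.C, doctor_123, doc_{dep,areacode}), nurse -> List(nurse))

Grouping by the normalized key does not decide which value is correct, but
it narrows the tedious manual review down to one candidate list per group.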
+Xiangrui
On Tue, Mar 15, 2016 at 10:24 AM, Sean Owen wrote:
> Picking up this old thread, since we have the same problem updating to
> Scala 2.11.8
>
> https://github.com/apache/spark/pull/11681#issuecomment-196932777
>
> We can see the org.spark-project packages here:
>
> http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.spark-project%22
Is it always the case that one title is a substring of another? -- Not
always. One title can have values like D.O.C, doctor_{areacode},
doc_{dep,areacode}.
On Mon, Mar 14, 2016 at 10:39 PM, Wail Alkowaileet
wrote:
> I think you need some sort of fuzzy join?
> Is it always the case that one title is a substring of another?
Trees are immutable, and TreeNode takes care of copying unchanged parts of
the tree when you are doing transformations. As a result, even if you do
construct a DAG with the Dataset API, the first transformation will turn it
back into a tree.
The only exception to this rule is when we share the re
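A toy sketch of the copy-on-transform behavior described above; this
mimics the shape of Catalyst's TreeNode but is not the actual Spark code
(paste into the Scala REPL to try it):

// A toy immutable plan tree in the spirit of Catalyst's TreeNode.
sealed trait Plan
case class Scan(table: String) extends Plan
case class Filter(cond: String, child: Plan) extends Plan
case class Join(left: Plan, right: Plan) extends Plan

// Bottom-up transformation: rebuild children first, then apply the
// rule to the (possibly rebuilt) node itself.
def transformUp(plan: Plan)(rule: PartialFunction[Plan, Plan]): Plan = {
  val rebuilt = plan match {
    case s: Scan       => s
    case Filter(c, ch) => Filter(c, transformUp(ch)(rule))
    case Join(l, r)    => Join(transformUp(l)(rule), transformUp(r)(rule))
  }
  rule.applyOrElse(rebuilt, identity[Plan])
}

// Even if the same node instance is shared (a DAG), each branch is
// rebuilt independently, so the transformed result is again a tree.
val shared = Scan("t")
val dag: Plan = Join(Filter("a > 1", shared), Filter("b < 2", shared))
val tree = transformUp(dag) { case Scan(t) => Scan(t.toUpperCase) }

This sketch rebuilds every node for brevity; Catalyst's real TreeNode
avoids re-allocating nodes whose children are unchanged.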
Picking up this old thread, since we have the same problem updating to
Scala 2.11.8
https://github.com/apache/spark/pull/11681#issuecomment-196932777
We can see the org.spark-project packages here:
http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.spark-project%22
I've forgotten who maintai
Is the SparkConf effectively a singleton? Could there be a Utils method to
return a clone of the SparkConf?
Cheers
On Tue, 15 Mar 2016 at 16:49 Marcelo Vanzin wrote:
> Oh, my bad. I think I left that from a previous part of the patch and
> forgot to revert it. Will fix.
>
> On Tue, Mar 15, 2016 at 7:37 AM, Koert Kuipers wrote:
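SparkConf does expose a public clone method, so a utility along the lines
suggested above could be as small as this sketch (ConfUtils is a
hypothetical name, not an existing Spark API):

import org.apache.spark.SparkConf

object ConfUtils {
  // Hypothetical helper: hand callers an independent copy so that
  // later set(...) calls do not mutate the shared/original conf.
  def cloneOf(conf: SparkConf): SparkConf = conf.clone
}

val base = new SparkConf().setAppName("demo")
val copy = ConfUtils.cloneOf(base).set("spark.executor.memory", "2g")
// base is untouched: base.get("spark.executor.memory", "1g") == "1g"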
Hi Amit,
I am slowly getting into it, so I will be in touch in a few weeks.
Jan
On Friday, March 11, 2016 09:22:27 Amit Chavan wrote:
Hi Jan,
Welcome to the group. I have used mapdb on some personal projects and
really enjoyed working with it. I am also willing to contribute to the
Spark community.
Logging is a "private[spark]" class so binary compatibility is not
important at all, because code outside of Spark isn't supposed to use
it. Mixing Spark library versions is also not recommended, not just
because of this reason.
There have been other binary changes in the Logging class in the past
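For readers less familiar with the Scala feature being referenced, a
minimal illustration of package-qualified visibility (illustrative code,
not Spark's actual source):

package org.apache.spark.internal

// private[spark] limits visibility to the org.apache.spark package
// tree. External projects cannot compile against this trait, so its
// binary signature is free to change between Spark releases.
private[spark] trait Logging {
  protected def logInfo(msg: => String): Unit = Console.println(msg)
}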
Oh, my bad. I think I left that from a previous part of the patch and
forgot to revert it. Will fix.
On Tue, Mar 15, 2016 at 7:37 AM, Koert Kuipers wrote:
> in this commit
>
> 8301fadd8d269da11e72870b7a889596e3337839
> Author: Marcelo Vanzin
> Date: Mon Mar 14 14:27:33 2016 -0700
> [SPARK-13626][CORE] Avoid duplicate config deprecation warnings.
I am trying to understand some parts of the Catalyst optimizer, but I
struggle with one bigger-picture issue:
LogicalPlan extends TreeNode, which makes sense since the optimizations
rely on tree transformations like transformUp and transformDown.
But how can a LogicalPlan be a tree? Isn't it really a DAG?
Dear Spark Users and Developers,
We (Distributed (Deep) Machine Learning Community (http://dmlc.ml/)) are happy
to announce the release of XGBoost4J
(http://dmlc.ml/2016/03/14/xgboost4j-portable-distributed-xgboost-in-spark-flink-and-dataflow.html),
a Portable Distributed XGBoost in Spark, Flink and Dataflow.
No, I don't agree that someone explicitly calling repartition or
shuffle is the same as a constructor that implicitly breaks
guarantees.
Realistically speaking, the changes you have made are also totally
incompatible with the way Kafka's new consumer works. Pulling
different out-of-order chunks of
I have been using Spark 2.0 snapshots with some libraries built for Spark
1.0 so far (simply because it worked). In the last few days I noticed this
new error:
[error] Uncaught exception when running
com.tresata.spark.sql.fieldsapi.FieldsApiSpec: java.lang.AbstractMethodError
sbt.ForkMain$ForkError: j
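A schematic of how this failure mode arises when library versions are
mixed (Logging and MyLib here are illustrative stand-ins, not the actual
Spark trait or any real library):

// The version your library was compiled against (method is concrete):
trait Logging {
  def log(msg: String): Unit = println(msg)
}

// The version actually on the runtime classpath might differ, e.g.:
//   trait Logging { def log(msg: String): Unit }   // now abstract

class MyLib extends Logging   // bytecode expects the old trait layout

// When the trait on the classpath no longer provides the concrete
// method the old bytecode relied on, the call fails at run time:
//   new MyLib().log("hi")  => java.lang.AbstractMethodError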
in this commit
8301fadd8d269da11e72870b7a889596e3337839
Author: Marcelo Vanzin
Date: Mon Mar 14 14:27:33 2016 -0700
[SPARK-13626][CORE] Avoid duplicate config deprecation warnings.
the following change was made
-class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging {
+class Sp
You can achieve this the normal RDD way: have one extra stage in the
pipeline where you properly standardize all the values (like replacing
doc with doctor) for all the columns before the join (see the sketch
below).
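A minimal sketch of that extra standardization stage using plain RDDs;
the standardize rules and sample data are illustrative assumptions, not
from this thread:

import org.apache.spark.{SparkConf, SparkContext}

object StandardizeThenJoin {
  // Illustrative standardization: letters only, collapse doc variants.
  def standardize(title: String): String =
    title.toLowerCase.replaceAll("[^a-z]", "") match {
      case "doc" | "doctor" => "doctor"
      case other            => other
    }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("standardize-join").setMaster("local[*]"))

    val left  = sc.parallelize(Seq(("D.O.C", 1), ("nurse", 2)))
    val right = sc.parallelize(Seq(("doctor_123", "cardiology")))

    // Extra stage: re-key both sides by the standardized title, then join.
    val joined = left.map { case (t, v) => (standardize(t), v) }
      .join(right.map { case (t, d) => (standardize(t), d) })

    joined.collect().foreach(println)  // (doctor,(1,cardiology))
    sc.stop()
  }
}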
Thanks
Best Regards
On Tue, Mar 15, 2016 at 9:16 AM, Suniti Singh
wrote:
> Hi All,
>
> I ha