I don't think it works, but there is no Hadoop 3.0 right now either. As
the version implies, it's going to be somewhat different API-wise.
On Thu, Oct 27, 2016 at 11:04 PM adam kramer wrote:
> Is the version of Spark built for Hadoop 2.7 and later only for 2.x
> releases?
>
>
Init is easy -- initialize them in your singleton.
Shutdown is harder; a shutdown hook is probably the only reliable way to go.
Global state is not ideal in Spark. Consider initializing things like
connections per partition, and open/close them with the lifecycle of a
computation on a partition.
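A rough sketch of that per-partition pattern (createConnection() and send() are
just stand-ins for whatever client your job actually uses):

    rdd.foreachPartition { records =>
      val conn = createConnection()          // open once per partition, on the executor
      try {
        records.foreach(r => conn.send(r))   // reuse the connection for the whole partition
      } finally {
        conn.close()                         // close when the partition's computation ends
      }
    }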
> On 26 October 2016 at 06:43, Ajay Chander <itsche...@gmail.com> wrote:
>
> Sean, thank you for mak
This usage is fine, because you are only using the HiveContext locally on
the driver. It's applied in a function that's used on a Scala collection.
You can't use the HiveContext or SparkContext in a distributed operation.
It has nothing to do with for loops.
The fact that they're serializable
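To illustrate the distinction, a sketch (hiveContext and the table names are
stand-ins for whatever you have on the driver):

    // Fine: the HiveContext is only used on the driver, over a local Scala collection
    val counts = Seq("table_a", "table_b").map(t => hiveContext.table(t).count())

    // Not fine: the context would be captured inside a distributed operation
    // someRdd.map(name => hiveContext.table(name).count())   // unsupported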
archive.apache.org will always have all the releases:
http://archive.apache.org/dist/spark/
On Tue, Oct 25, 2016 at 1:17 PM ayan guha wrote:
> Just in case, anyone knows how I can download Spark 1.2? It is not showing
> up in Spark download page drop down
>
>
> --
> Best
In the context of Spark, there are already things like RandomRDDs and SQL
randn() to generate random standard normal variables.
If you want to do it directly, Commons Math is a good choice in the JVM,
among others.
Once you have a standard normal, just multiply by the stdev and add the
mean.
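For example, with the SQL function in Spark 2.x (the mean and stdev values
here are just illustrative):

    import org.apache.spark.sql.functions.randn

    // standard normal, rescaled to mean 10.0 and standard deviation 2.5
    val samples = spark.range(100000).select((randn() * 2.5 + 10.0).as("x"))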
I believe it will be too late to set it there, and these are JVM flags, not
app or Spark flags. See spark.driver.extraJavaOptions and likewise for the
executor.
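Something along these lines on the spark-submit command line (the specific -D
flags are just an example for locale settings):

    spark-submit \
      --conf "spark.driver.extraJavaOptions=-Duser.language=en -Duser.country=US" \
      --conf "spark.executor.extraJavaOptions=-Duser.language=en -Duser.country=US" \
      ...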
On Mon, Oct 24, 2016 at 4:04 PM Pietro Pugni wrote:
> Thank you!
>
> I tried again setting locale options in
ach type of machine. Maybe I could go
> beyond the limitation in the cluster. I just want to make sure I understand
> correctly that when allocating vcores, it means vcores not the threads.
>
> Thanks a lot.
>
> Best
>
>
>
> On Mon, Oct 24, 2016 at 4:55 PM, Sean Owen <
If you're really sure that 4 executors are on 1 machine, then it means your
resource manager allowed it. What are you using, YARN? Check that you
really are limited to 40 cores per machine in the YARN config.
On Mon, Oct 24, 2016 at 3:33 PM TheGeorge1918 .
wrote:
> Hi
k into this too
> within coming few days..
>
> 2016-10-24 21:32 GMT+09:00 Sean Owen <so...@cloudera.com>:
>
> I actually think this is a general problem with usage of DateFormat and
> SimpleDateFormat across the code, in that it relies on the default locale
> of the
I actually think this is a general problem with usage of DateFormat and
SimpleDateFormat across the code, in that it relies on the default locale
of the JVM. I believe this needs to, at least, default consistently to
Locale.US so that behavior is consistent; otherwise it's possible that
parsing
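The fix, sketched: construct formats with an explicit locale rather than the
JVM default, e.g.:

    import java.text.SimpleDateFormat
    import java.util.Locale

    // parsing no longer depends on whatever the JVM's default locale happens to be
    val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US)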
Try adding the spark-streaming_2.11 artifact as a dependency too. You will
be directly depending on it.
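For example, in sbt (the version should match your Spark version; 2.0.1 here
is just illustrative):

    // build.sbt -- %% appends the Scala suffix, giving spark-streaming_2.11
    libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.0.1"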
On Tue, Oct 18, 2016 at 2:16 PM Furkan KAMACI
wrote:
> Hi,
>
> I have a search application and want to monitor queries per second for it.
> I have Kafka at my backend
You're now asking about Couchbase code, so this isn't the best place to
ask. Head to the Couchbase forums.
On Mon, Oct 17, 2016 at 10:14 AM Devi P.V wrote:
> Hi,
> I tried with the following code
>
> import com.couchbase.spark._
> val conf = new SparkConf()
>
You can take the "with user-provided Hadoop" binary from the download page,
and yes that should mean it does not drag in a Hive dependency of its own.
On Mon, Oct 17, 2016 at 7:08 AM Xi Shen wrote:
> Hi,
>
> I want to configure my Hive to use Spark 2 as its engine.
Did you unpersist the broadcast objects?
On Mon, Oct 17, 2016 at 10:02 AM lev wrote:
> Hello,
>
> I'm in the process of migrating my application to spark 2.0.1,
> And I think there is some memory leaks related to Broadcast joins.
>
> the application has many unit tests,
> and
It pretty much means what it says. Objects you send across machines must be
serializable, and the object from the library is not.
You can write a wrapper object that is serializable and knows how to
serialize it. Or ask the library dev to consider making this object
serializable.
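A minimal sketch of the wrapper approach, where SomeClient stands in for the
non-serializable library class:

    class SerializableClient(host: String, port: Int) extends Serializable {
      // rebuilt lazily on each executor after deserialization, never shipped itself
      @transient lazy val client: SomeClient = new SomeClient(host, port)
    }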
On Mon, Oct 17,
Is it just a typo in the email or are you missing a space after your
--master argument?
The logs here actually don't say much but "something went wrong". It seems
fairly low-level, like the gateway process failed or didn't start, rather
than a problem with the program. It's hard to say more
You can specify it; it just doesn't do anything but cause a warning in Java
8. It won't work in general to have such a tiny PermGen. If it appears to
work, that means you're on Java 8, where the setting is ignored. You should set
MaxPermSize if anything, not PermSize. However the error indicates you are not using
The error doesn't say you're out of memory, but says you're out of PermGen.
If you see this, you aren't running Java 8 AFAIK, because 8 has no PermGen.
But if you're running Java 7, and you go investigate what this error means,
you'll find you need to increase PermGen. This is mentioned in the
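On Java 7 that would look something like this on the spark-submit command line
(sizes are illustrative; on Java 8 the flag is simply ignored):

    spark-submit \
      --driver-java-options "-XX:MaxPermSize=256m" \
      --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=256m" \
      ...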
I don't believe that's been released yet. It looks like it was merged into
branches about a week ago. You're looking at unreleased docs too - have a
look at http://spark.apache.org/docs/latest/ for the latest released docs.
On Thu, Oct 13, 2016 at 9:24 AM JayKay
See https://issues.apache.org/jira/browse/SPARK-17588
On Wed, Oct 12, 2016 at 9:07 PM Meeraj Kunnumpurath <
mee...@servicesymphony.com> wrote:
> If I drop the last feature on the third model, the error seems to go away.
>
> On Wed, Oct 12, 2016 at 11:52 PM, Meeraj Kunnumpurath <
>
I don't believe it will ever scale to spin up a whole distributed job to
serve one request. You can look possibly at the bits in mllib-local. You
might do well to export as something like PMML either with Spark's export
or JPMML and then load it into a web container and score it, without Spark
"Compile failed via zinc server"
Try shutting down zinc. Something's funny about your compile server.
It's not required anyway.
On Sat, Oct 1, 2016 at 3:24 PM, Marco Mistroni wrote:
> Hi guys
> sorry to annoy you on this but i am getting nowhere. So far i have tried to
>
No, I think that's what dependencyManagement (or equivalent) is definitely for.
On Thu, Sep 29, 2016 at 5:37 AM, Olivier Girardot
wrote:
> I know that the code itself would not be the same, but it would be useful to
> at least have the pom/build.sbt transitive
with the
> ones defined in the pom profile
>
>
>
> On Thu, Sep 22, 2016 11:17 AM, Sean Owen so...@cloudera.com wrote:
>
>> There can be just one published version of the Spark artifacts and they
>> have to depend on something, though in truth they'd be binary-compatible
>
I don't recall any code in Spark that computes a matrix inverse. There is
code that solves linear systems Ax = b with a decomposition. For example
from looking at the code recently, I think the regression implementation
actually solves AtAx = Atb using a Cholesky decomposition. But, A = n x k,
Yes I think that footnote could be a lot more prominent, or pulled up
right under the table.
I also think it would be fine to present the {0,1} formulation. It's
actually more recognizable, I think, for log-loss in that form. It's
probably less recognizable for hinge loss, but, consistency is
I don't think I'd enable swap on a cluster. You'd rather processes
fail than grind everything to a halt. You'd buy more memory or
optimize memory before trading it for I/O.
On Thu, Sep 22, 2016 at 6:29 PM, Michael Segel
wrote:
> Ok… gotcha… wasn’t sure that YARN just
wrote:
> Thanks for the response Sean.
>
> But how does YARN know about the off-heap memory usage?
> That’s the piece that I’m missing.
>
> Thx again,
>
> -Mike
>
>> On Sep 21, 2016, at 10:09 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> No, Xmx o
https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects
and maybe related ...
https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark
On Thu, Sep 22, 2016 at 11:15 AM, tahirhn wrote:
> I am planning to write a thesis on certain aspects (i.e testing,
There can be just one published version of the Spark artifacts and they
have to depend on something, though in truth they'd be binary-compatible
with anything 2.2+. So you merely manage the dependency versions up to the
desired version in your <dependencyManagement>.
On Thu, Sep 22, 2016 at 7:05 AM, Olivier Girardot <
No, Xmx only controls the maximum size of on-heap allocated memory.
The JVM doesn't manage/limit off-heap (how could it? it doesn't know
when it can be released).
The answer is that YARN will kill the process because it's using more
memory than it asked for. A JVM is always going to use a little
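If that's what is happening on YARN, the usual remedy is to leave more headroom
for off-heap usage, e.g. (value in MB, purely illustrative):

    spark-submit --conf spark.yarn.executor.memoryOverhead=1024 ...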
Done.
On Wed, Sep 21, 2016 at 5:53 AM, Romi Kuntsman wrote:
> Hello,
> Please add a link in Spark Community page
> (https://spark.apache.org/community.html)
> To Israel Spark Meetup (https://www.meetup.com/israel-spark-users/)
> We're an active meetup group, unifying the
>> Thanks Sean.
>>
>> On Sep 20, 2016 7:45 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>>
>>> Ah, I think that this was supposed to be changed with SPARK-9062. Let
>>> me see about reopening 10835 and addressing it.
>>&
Ah, I think that this was supposed to be changed with SPARK-9062. Let
me see about reopening 10835 and addressing it.
On Tue, Sep 20, 2016 at 3:24 PM, janardhan shetty
wrote:
> Is this a bug?
>
> On Sep 19, 2016 10:10 PM, "janardhan shetty" wrote:
This isn't a Spark question, so I don't think this is the right place.
It shows that compilation of rJava failed for lack of some other
shared libraries (not Java-related). I think you'd have to get those
packages installed locally too.
If it ends up being Anaconda specific, you should try
It backed the "OFF_HEAP" storage level for RDDs. That's not quite the
same thing that off-heap Tungsten allocation refers to.
It's also worth pointing out that things like HDFS also can put data
into memory already.
On Mon, Sep 19, 2016 at 7:48 PM, Richard Catlin
Yes, relevance is always 1. The label is not a relevance score, so I
don't think it's valid to use it as such.
On Mon, Sep 19, 2016 at 4:42 AM, Jong Wook Kim wrote:
> Hi,
>
> I'm trying to evaluate a recommendation model, and found that Spark and
> Rival give different results,
Alluxio isn't a database though; it's storage. I may still be harping
on the wrong solution for you, but as we discussed offline, that's
also what Impala, Drill et al are for.
Sorry if this was mentioned before but Ignite is what GridGain became,
if that helps.
On Sat, Sep 17, 2016 at 11:00 PM,
NoSuchFieldError in an HTTP client class?
This almost always means you have conflicting versions of an
unshaded dependency on your classpath, and in this case could be
httpclient. You can often work around this with the userClassPathFirst
options for driver and executor.
On Sun, Sep 18, 2016 at
The result includes, essentially, all the terms in (x+y) and (x+y)^2,
and so on up if you chose a higher power. It is not just the
second-degree terms.
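For example, with the ml feature transformer (column names are illustrative):

    import org.apache.spark.ml.feature.PolynomialExpansion

    val polyExpansion = new PolynomialExpansion()
      .setInputCol("features")
      .setOutputCol("polyFeatures")
      .setDegree(2)
    // For a 2-dimensional input (x, y), degree 2 yields the degree-1 and degree-2
    // terms -- x, x^2, y, x*y, y^2 -- not just the second-degree terms.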
On Fri, Sep 16, 2016 at 7:43 PM, Nirav Patel wrote:
> Doc says:
>
> Take a 2-variable feature vector as an example: (x,
code given with spark
> to run ALS on movie lens dataset. I did not change anything in the code.
> However I am running this example on Netflix dataset (1.5 gb)
>
> Thanks,
> Roshani
>
>
> On Friday, September 16, 2016, Sean Owen <so...@cloudera.com> wrote:
>>
>
You may have to decrease the checkpoint interval to, say, 5 if you're
getting StackOverflowError. You may have a particularly deep lineage
being created during iterations.
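With the ml ALS implementation, for instance, that's roughly (the checkpoint
directory is just illustrative):

    import org.apache.spark.ml.recommendation.ALS

    spark.sparkContext.setCheckpointDir("hdfs:///tmp/als-checkpoint")
    val als = new ALS()
      .setMaxIter(20)
      .setCheckpointInterval(5)   // checkpoint more often to keep the lineage shallow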
No space left on device means you don't have enough local disk to
accommodate the big shuffles in some stage. You can add more
countApprox gives the best answer within some timeout. Is it possible
that 1ms is more than enough to count this exactly? then the
confidence wouldn't matter. Although that seems way too fast, you're
counting ranges whose values don't actually matter, and maybe the
Python side is smart enough to
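For reference, the Scala form (timeout in milliseconds; the values are
illustrative):

    // best available answer within ~1000 ms, at 95% confidence
    val partial = rdd.countApprox(timeout = 1000L, confidence = 0.95)
    println(partial.initialValue)   // a BoundedDouble with mean, low and high bounds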
Why Hive, and why precompute data at 15-minute latency? There are
several ways here to query the source data directly with no extra step
or latency. Even Spark SQL is real-time-ish for queries on the
source data, and Impala (or heck Drill etc) are too.
On Thu, Sep 15, 2016 at 10:56 PM, Mich
If your core requirement is ad-hoc real-time queries over the data,
then the standard Hadoop-centric answer would be:
Ingest via Kafka,
maybe using Flume, or possibly Spark Streaming, to read and land the data, in...
Parquet on HDFS or possibly Kudu, and
Impala to query
>> On 15 September 2016
If it helps, I've already updated that code for the 2nd edition, which
will be based on ~Spark 2.1:
https://github.com/sryza/aas/blob/master/ch04-rdf/src/main/scala/com/cloudera/datascience/rdf/RunRDF.scala#L220
This should be an equivalent working example that deals with
categoricals via
, Pasquinell Urbani
<pasquinell.urb...@exalitica.com> wrote:
> The implicit rankings are the output of TF-IDF. I.e.:
> each_ranking = frequency of an item * log(amount of total customers/amount of
> customers buying the item)
>
>
> El 14 sept. 2016 17:14, "Sean Owen" <so.
enerated by TF-IDF), can this affect
> the error? (I'm currently using trainImplicit in ALS, spark 1.6.2)
>
> Thank you.
>
>
>
> 2016-09-14 16:49 GMT-03:00 Sean Owen <so...@cloudera.com>:
>
>> There is no way to answer this without knowing what your inpu
There is no way to answer this without knowing what your inputs are
like. If they're on the scale of thousands, that's small (good). If
they're on the scale of 1-5, that's extremely poor.
What's RMS vs RMSE?
On Wed, Sep 14, 2016 at 8:33 PM, Pasquinell Urbani
is defined as
> abs( C1/C1 - C2/C1 ) + abs (D1/D1 - D2/D1)
> One cannot do
> abs( (C1/C1 + D1/D1) - (C2/C1 + D2/ D1) )
>
>
> Any further tips?
>
> Best,
> Rex
>
>
>
> On Tue, Sep 13, 2016 at 11:09 AM, Sean Owen <so...@cloudera.com> wrote:
>>
Based on your description, this isn't a problem in Spark. It means
your JDBC connector isn't interpreting bytes from the database
according to the encoding in which they were written. It could be
Latin1, sure.
But if "new String(ResultSet.getBytes())" works, it's only because
your platform's
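If you do have to decode bytes yourself, specify the charset explicitly rather
than relying on the platform default (resultSet and the column name are
stand-ins):

    import java.nio.charset.StandardCharsets

    // decode with the encoding the database actually wrote, e.g. Latin-1
    val s = new String(resultSet.getBytes("some_column"), StandardCharsets.ISO_8859_1)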
The key is really to specify the distance metric that defines
"closeness" for you. You have features that aren't on the same scale,
and some that aren't continuous. You might look to clustering for
ideas here, though mostly you just want to normalize the scale of
dimensions to make them
But you're in the shell there, which already has a SparkContext for you as
sc.
On Tue, Sep 13, 2016 at 6:49 PM, Kevin Burton wrote:
> I'm rather confused here as to what to do about creating a new
> SparkContext.
>
> Spark 2.0 prevents it... (exception included below)
>
>
be
> TRUE right?
>
> On Tue, Sep 6, 2016 at 1:38 PM Sean Owen <so...@cloudera.com> wrote:
>>
>> Are you not fitting an intercept / regressing through the origin? with
>> that constraint it's no longer true that R^2 is necessarily
>> nonnegative. It basically me
Are you not fitting an intercept / regressing through the origin? with
that constraint it's no longer true that R^2 is necessarily
nonnegative. It basically means that the errors are even bigger than
what you'd get by predicting the data's mean value as a constant
model.
On Tue, Sep 6, 2016 at
ze of dataset is (unsurprisingly) big.
>
> To be honest I do not really understand what do you mean by b). Since
> DataFrame is now only an alias for Dataset[Row] what do you mean by
> "DataFrame-like counterpart"?
>
> Thanks
>
> On Thu, Sep 1, 2016 at 2:31 PM, Sea
You can look into the SparkListener interface to get some of those
messages. Losing the master though is pretty fatal to all apps.
On Mon, Sep 5, 2016 at 7:30 AM, Hough, Stephen C wrote:
> I have a long running application, configured to be HA, whereby only the
>
Given recall by threshold, you can compute true positive count per
threshold by just multiplying through by the count of elements where
label = 1. From that you can get false negatives by subtracting from
that same count.
Given precision by threshold, and true positives count by threshold,
you
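In other words, something like this (recallAt, precisionAt, labels and
threshold are stand-in names for whatever you already have):

    val numPositives   = labels.count(_ == 1.0)                 // count of label = 1
    val truePositives  = recallAt(threshold) * numPositives
    val falseNegatives = numPositives - truePositives
    // from precision: total predicted positives = truePositives / precisionAt(threshold),
    // so falsePositives = truePositives / precisionAt(threshold) - truePositives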
Spark should work fine with Python 3. I'm not a Python person, but all else
equal I'd use 3.5 too. I assume the issue could be libraries you want that
don't support Python 3. I don't think that changes with CDH. It includes a
version of Anaconda from Continuum, but that lays down Python 2.7.11. I
On Thu, Sep 1, 2016 at 4:56 PM, Mich Talebzadeh
wrote:
> Data Frame built on top of RDD to create as tabular format that we all love
> to make the original build easily usable (say SQL like queries, column
> headings etc). The drawback is it restricts you with what you
Here's my paraphrase:
Datasets are really the new RDDs. They have a similar nature
(container of strongly-typed objects) but bring some optimizations via
Encoders for common types.
DataFrames are different from RDDs and Datasets; they neither replace them
nor are replaced by them. They're
Yeah there's a method to predict one Vector in the .mllib API but not
the newer one. You could possibly hack your way into calling it
anyway, or just clone the logic.
On Thu, Sep 1, 2016 at 2:37 PM, Nick Pentreath wrote:
> Right now you are correct that Spark ML APIs do
use it on a
> row by row basis?
>
> Thanks for your inputs.
>
> On Thu, Sep 1, 2016 at 6:15 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> If you're trying to score a single example by way of an RDD or
>> Dataset, then no it will never be that fast. It's a whole dis
If you're trying to score a single example by way of an RDD or
Dataset, then no it will never be that fast. It's a whole distributed
operation, and while you might manage low latency for one job at a
time, consider what will happen when hundreds of them are running at
once. It's just huge overkill
You can always call .rdd.top(n) of course. Although it's slightly
clunky, you can also .orderBy($"value".desc).take(n). Maybe there's an
easier way.
I don't think there's a strong reason other than it wasn't worth it
to write this and many other utility wrappers that a) already exist on
the
I can't think of a situation where it would be materially different.
Both are using the JVM-based APIs directly. Here and there there's a
tiny bit of overhead in using the Java APIs because something is
translated from a Java-style object to a Scala-style object, but this
is generally trivial.
On
--jars includes a local JAR file in the application's classpath.
--packages references Maven coordinates of a dependency, retrieves all of
those JAR files (including transitive dependencies), and includes them in
the app classpath.
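For example (the jar path, coordinates and class are placeholders):

    spark-submit \
      --jars /path/to/local-extra.jar \
      --packages com.example:some-library_2.11:1.0.0 \
      --class com.example.MyApp \
      my-app.jar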
On Thu, Sep 1, 2016 at 10:24 AM, Divya Gehlot wrote:
> Hi,
>
Weird, I recompiled Spark with a similar change to Model and it seemed
to work but maybe I missed a step in there.
On Wed, Aug 31, 2016 at 6:33 AM, Mohit Jaggi wrote:
> I think I figured it out. There is indeed "something deeper in Scala” :-)
>
> abstract class A {
> def
I think it's imitating, for example, how Enum is declared in Java:
abstract class Enum<E extends Enum<E>>
this is done so that Enum can refer to the actual type of the derived
enum class when declaring things like public final int compareTo(E o)
to implement Comparable. The type is redundant in a sense, because
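A Scala rendering of the same self-referential ("F-bounded") pattern, just as
a sketch with made-up names:

    abstract class MyEnum[E <: MyEnum[E]] {
      def compareTo(other: E): Int          // 'other' has the concrete subclass's type
    }

    class Color extends MyEnum[Color] {
      def compareTo(other: Color): Int = 0  // placeholder implementation
    }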
If something isn't public, then it could change across even
maintenance releases. Although you can indeed still access it in some
cases by writing code in the same package, you're taking some risk
that it will stop working across releases.
If it's not public, the message is that you should build
oint.
>
> Also which option would be better, store the output of RDD to a persistent
> storage, or store the new RDD of that ouput itself using checkpoint.
>
> Thanks
> Sachin
>
>
>
>
> On Mon, Aug 29, 2016 at 1:39 PM, Sean Owen <so...@cloudera.com>
You just save the data in the RDD in whatever form you want to
whatever persistent storage you want, and then re-read it from another
job. This could be Parquet format on HDFS for example. Parquet is just
a common file format. There is no need to keep the job running just to
keep an RDD alive.
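A minimal sketch in Spark 2.x, assuming the RDD holds a case class (the paths
are illustrative):

    import spark.implicits._

    // first job: persist the result as Parquet on HDFS
    rdd.toDF().write.parquet("hdfs:///data/my-results")

    // a later, separate job: read it back; no need to keep the first job alive
    val results = spark.read.parquet("hdfs:///data/my-results")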
On
No, it is just being truncated for display as the ... implies. Pass
truncate=false to the show command.
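That is, assuming your DataFrame is df:

    df.show(truncate = false)   // print full column values instead of truncating with "..."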
On Sun, Aug 28, 2016, 15:24 Kevin Tran wrote:
> Hi,
> I wrote to parquet file as following:
>
> ++
> |word|
> ++
>
Without a distributed storage system, your application can only create data
on the driver and send it out to the workers, and collect data back from
the workers. You can't read or write data in a distributed way. There are
use cases for this, but pretty limited (unless you're running on 1
Sqoop is probably the more mature tool for the job. It also just does
one thing. The argument for doing it in Spark would be wanting to
integrate it with a larger workflow. I imagine Sqoop would be more
efficient and flexible for just the task of ingest, including
continuously pulling deltas which
We're probably mixing up some semantics here. An RDD is indeed,
really, just some bookkeeping that records how a certain result is
computed. It is not the data itself.
However we often talk about "persisting an RDD" which means
"persisting the result of computing the RDD" in which case that
You are attempting to read a tar file. That won't work. A compressed JSON
file would.
On Sun, Aug 21, 2016, 12:52 Chua Jie Sheng wrote:
> Hi Spark user list!
>
> I have been encountering corrupted records when reading Gzipped files that
> contains more than one file.
>
>
Yes, have a look through JIRA in cases like this.
https://issues.apache.org/jira/browse/SPARK-16664
On Sat, Aug 20, 2016 at 1:57 AM, mhornbech wrote:
> I did some extra digging. Running the query "select column1 from myTable" I
> can reproduce the problem on a frame with a
Historically, minor releases happen every ~4 months, and maintenance
releases are a bit ad hoc but come about a month after the minor
release. It's up to the release manager to decide to do them but maybe
realistic to expect 2.0.1 in early September.
On Thu, Aug 18, 2016 at 10:35 AM, Adrian
I'd say that Datasets, not DataFrames, are the natural evolution of
RDDs. DataFrames are for inherently tabular data, and most naturally
manipulated by SQL-like operations. Datasets operate on programming
language objects like RDDs.
So, RDDs to DataFrames isn't quite apples-to-apples to begin
-dev (this is appropriate for user@)
Probably https://issues.apache.org/jira/browse/SPARK-10141 or
https://issues.apache.org/jira/browse/SPARK-11334 but those aren't
resolved. Feel free to jump in.
On Mon, Aug 15, 2016 at 8:13 PM, Rachana Srivastava <
rachana.srivast...@markmonitor.com> wrote:
Class imbalance can be an issue for algorithms, but decision forests
should in general cope reasonably well with imbalanced classes. By
default, positive and negative classes are treated 'equally' however,
and that may not reflect reality in some cases. Upsampling the
under-represented case is a
11, 2016 at 11:02 AM, Sean Owen <so...@cloudera.com> wrote:
> No, that doesn't describe the change being discussed, since you've
> copied the discussion about adding an 'offset'. That's orthogonal.
> You're also suggesting making withMean=True the default, which we
> don
seFeatures)
>
> Thanks,
> Tobi
>
>
> On Wed, Aug 10, 2016 at 1:01 PM, Nick Pentreath <nick.pentre...@gmail.com>
> wrote:
>>
>> Ah right, got it. As you say for storage it helps significantly, but for
>> operations I suspect it puts one back in a "dense
an
optimization.
On Wed, Aug 10, 2016, 18:10 Nick Pentreath <nick.pentre...@gmail.com> wrote:
> Sean by 'offset' do you mean basically subtracting the mean but only from
> the non-zero elements in each row?
> On Wed, 10 Aug 2016 at 19:02, Sean Owen <so...@cloudera.com> wrote:
> standardization, as opposed to people thinking they are standardizing when
> they actually are not.
>
> Can anyone confirm whether there is a jira already?
>
> On Wed, Aug 10, 2016 at 10:58 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> Dense vs sparse is
Dense vs sparse is just a question of representation, so doesn't make
an operation on a vector more or less important as a result. You've
identified the reason that subtracting the mean can be undesirable: a
notionally billion-element sparse vector becomes too big to fit in
memory at once.
I know
t I am confused based on the results above
> and I am wondering what factors should be removed to get a meaningful result
> (may be with 5% less accuracy)
>
> Will appreciate any help here.
>
> -Rohit
>
> On Tue, Aug 9, 2016 at 12:55 PM, Sean Owen <so...@cloudera.com>
Nightlies are built and made available in the ASF snapshot repo, from
master. This is noted at the bottom of the downloads page, and at
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-NightlyBuilds
. This hasn't changed in as long as I can recall.
Chaddha <rohitchaddha1...@gmail.com> wrote:
> I would rather have less features to make better inferences on the data
> based on the smaller number of factors,
> Any suggestions Sean ?
>
> On Mon, Aug 8, 2016 at 11:37 PM, Sean Owen <so...@cloudera.com> wrote:
>
I also don't know what's going on with the "This post has NOT been
accepted by the mailing list yet" message, because actually the
messages always do post. In fact this has been sent to the list 4
times:
https://www.mail-archive.com/search?l=user%40spark.apache.org=dueckm=0=0
On Mon, Aug 8, 2016
In case the attachments don't come through, BTW those are indeed
downloadable from the directory http://spark.apache.org/images/
On Mon, Aug 8, 2016 at 6:09 PM, Sivakumaran S wrote:
> Found these from the spark.apache.org website.
>
> HTH,
>
> Sivakumaran S
>
>
>
>
>
> On
That message is a warning, not an error. It is just because you're cross-
compiling with Java 8. If something failed, it was elsewhere.
On Thu, Aug 4, 2016, 07:09 Richard Siebeling wrote:
> Hi,
>
> spark 2.0 with mapr hadoop libraries was succesfully build using the
> following
You're looking for http://bahir.apache.org/
On Wed, Aug 3, 2016 at 8:40 PM, Kiran Chitturi
wrote:
> Hi,
>
> When Spark 2.0.0 is released, the 'spark-streaming-twitter' package and
> several other packages are not released/published to maven central. It looks
> like
3:0.0"
> I thought I need to follow the same numbering while creating vector too.
>
> thanks a bunch
>
>
> On Thu, Aug 4, 2016 at 12:39 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> You mean "new int[] {0,1,2}" because vectors are 0-indexed.
model with a 3 dimension vector ?
> I am not sre what is wrong in this approach. i am missing a point ?
>
> Tony
>
> On Wed, Aug 3, 2016 at 11:22 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> You declare that the vector has 3 dimensions, but then refer to its
>&
file: "absolute directory"
does not sound like a valid URI
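A well-formed local-file URI carries the scheme plus an absolute path, e.g.
(the path is illustrative):

    val lines = sc.textFile("file:///home/user/data/input.json")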
On Wed, Aug 3, 2016 at 11:05 AM, Flavio wrote:
> Hello everyone,
>
> I am try to run a very easy example but unfortunately I am stuck on the
> follow exception:
>
> Exception in thread "main"