We just updated to Spark 1.2.0 from Spark 1.1.0. We have a small framework
that we've been developing that connects various different RDDs together
based on some predefined business cases. After updating to 1.2.0, some of
the concurrency expectations about how the stages within jobs are executed:
Looks like the number of skipped stages couldn't be formatted.
Cheers
On Wed, Jan 7, 2015 at 12:08 PM, Corey Nolet cjno...@gmail.com wrote:
We just upgraded to Spark 1.2.0 and we're seeing this in the UI.
lineages.
What's strange is that this bug only surfaced when I updated Spark.
On Wed, Jan 7, 2015 at 9:12 AM, Corey Nolet cjno...@gmail.com wrote:
We just updated to Spark 1.2.0 from Spark 1.1.0. We have a small framework
that we've been developing that connects various different RDDs together
On Dec. 31, 2014, 3:40 p.m., Corey Nolet wrote:
Diff: https://reviews.apache.org/r/29502/diff/
Testing
---
Wrote an integration test to verify that ScanDataSource is actually setting the
authorizations on the IteratorEnvironment
Thanks,
Corey Nolet
on the IteratorEnvironment
Thanks,
Corey Nolet
Hitarth,
I don't know how much direction you are looking for with regards to the
formats of the times but you can certainly read both files into the third
mapreduce job using the FileInputFormat by comma-separating the paths to
the files. The blocks for both files will essentially be unioned
driver application.
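A minimal Scala sketch of the comma-separated input idea described above (untested; the output paths of the first two jobs are made up):

import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat

// Hypothetical output paths of the first two jobs; the comma-separated list
// makes the third job treat the blocks of both files as one logical input
val job = Job.getInstance()
FileInputFormat.setInputPaths(job, "/output/job1,/output/job2")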
Here's the example code on github:
https://github.com/cjnolet/spark-jetty-server
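The gist of it, as a minimal sketch (not lifted from that repo; assumes the Spark and YARN client jars are already on the webapp's classpath):

import org.apache.spark.{SparkConf, SparkContext}

// Build the context inside the web container; jobs submitted through it
// run on the YARN cluster via yarn-client mode
val conf = new SparkConf()
  .setMaster("yarn-client")
  .setAppName("spark-jetty-server")
val sc = new SparkContext(conf)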
On Fri, Jan 2, 2015 at 11:35 PM, Corey Nolet cjno...@gmail.com wrote:
So looking @ the actual code- I see where it looks like --class 'notused'
--jar null is set on the ClientBase.scala when yarn
Looking a little closer @ the launch_container.sh file, it appears to be
adding a $PWD/__app__.jar to the classpath but there is no __app__.jar in
the directory pointed to by PWD. Any ideas?
On Fri, Jan 2, 2015 at 4:20 PM, Corey Nolet cjno...@gmail.com wrote:
I'm trying to get a SparkContext going in a web container which is being
submitted through yarn-client. I'm trying two different approaches and both
seem to be resulting in the same error from the yarn nodemanagers:
1) I'm newing up a spark context directly, manually adding all the lib jars
from
On Fri, Jan 2, 2015 at 5:46 PM, Corey Nolet cjno...@gmail.com wrote:
.. and looking even further, it looks like the actual command that's
executed starting up the JVM to run the
org.apache.spark.deploy.yarn.ExecutorLauncher is passing in --class
'notused' --jar null.
I would assume this isn't expected
they aren't making it through.
On Fri, Jan 2, 2015 at 5:02 PM, Corey Nolet cjno...@gmail.com wrote:
Looking a little closer @ the launch_container.sh file, it appears to be
adding a $PWD/__app__.jar to the classpath but there is no __app__.jar in
the directory pointed to by PWD. Any ideas?
On Fri, Jan
on the IteratorEnvironment
Thanks,
Corey Nolet
On Dec. 31, 2014, 1:46 p.m., Corey Nolet wrote
that ScanDataSource is actually setting the
authorizations on the IteratorEnvironment
Thanks,
Corey Nolet
PRE-CREATION
Diff: https://reviews.apache.org/r/29502/diff/
Testing
---
Wrote an integration test to verify that ScanDataSource is actually setting the
authorizations on the IteratorEnvironment
Thanks,
Corey Nolet
I want to have a SparkContext inside of a web application running in Jetty
that I can use to submit jobs to a cluster of Spark executors. I am running
YARN.
Ultimately, I would love it if I could just use something like
SparkSubmit.main() to allocate a bunch of resources in YARN when the webapp
Let's say I have an RDD which gets cached and has two children which do
something with it:
val rdd1 = ...cache()
rdd1.saveAsSequenceFile()
rdd1.groupBy().saveAsSequenceFile()
If I were to submit both calls to saveAsSequenceFile() in threads to take
advantage of concurrency (where
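A minimal sketch of submitting both actions from separate threads (untested; the input, key function, and paths are made up, with reduceByKey standing in for the groupBy above so the types work out):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

val sc = new SparkContext(new SparkConf().setAppName("concurrent-saves"))

// Assumed input: a pair RDD so saveAsSequenceFile applies; cached so both jobs share it
val rdd1 = sc.textFile("/data/in").map(line => (line, 1)).cache()

// Each Future submits an independent action from its own thread,
// so the scheduler is free to run the two jobs concurrently
val f1 = Future { rdd1.saveAsSequenceFile("/out/raw") }
val f2 = Future { rdd1.reduceByKey(_ + _).saveAsSequenceFile("/out/counts") }
Await.result(Future.sequence(Seq(f1, f2)), Duration.Inf)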
If I have 2 RDDs which depend on the same RDD like the following:
val rdd1 = ...
val rdd2 = rdd1.groupBy()...
val rdd3 = rdd1.groupBy()...
If I don't cache rdd1, will its lineage be calculated twice (one for rdd2
and one for rdd3)?
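If I understand the caching contract right, a sketch of the difference (source path and key functions are made up):

// Without cache(), each action below re-runs rdd1's full lineage;
// with it, the second action reads the persisted partitions instead
val rdd1 = sc.textFile("/data/in")
rdd1.cache()

val rdd2 = rdd1.groupBy(_.length)
val rdd3 = rdd1.groupBy(_.take(1))

rdd2.count()   // materializes rdd1 and populates the cache
rdd3.count()   // reuses the cached partitions; /data/in is not re-read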
The dates of the jars were still Dec 10th.
I figured that was because the jars were staged in Nexus on that date
(before the vote).
On Fri, Dec 19, 2014 at 12:16 PM, Ted Yu yuzhih...@gmail.com wrote:
Looking at:
http://search.maven.org/#browse%7C717101892
The dates of the jars were
Have you started tracking a CHANGES list yet (do we need to update
anything added back in 1.6.2)?
I did start a CHANGES file in the 1.6.2-SNAPSHOT branch. I figure after the
tickets settle down I'll just create a new one.
On Thu, Dec 18, 2014 at 2:05 PM, Christopher ctubb...@apache.org wrote:
is preferable.
On Tue, Dec 16, 2014 at 7:18 PM, Corey Nolet cjno...@gmail.com wrote:
I have cycles to spin the RCs- I wouldn't mind finishing the updates
(per
my notes) of the release documentation as well.
On Tue, Dec 16, 2014 at 7:11 PM, Christopher ctubb...@apache.org
wrote
Since we've been discussing cutting an rc0 for testing before we begin the
formal release process, I've moved over all the non-blocker tickets from
1.6.2 to 1.6.3 [1]. Many of the tickets that moved haven't been updated
since the 1.6.1 release. If there are tickets you feel are necessary for
I'm working on updating the Making a Release page on our website [1] with
more detailed instructions on the steps involved. The "Create the candidate"
section references the build.sh script and I'm contemplating just removing
it altogether since it seems like, after quick discussions with a few
I have cycles to spin the RCs- I wouldn't mind finishing the updates (per
my notes) of the release documentation as well.
On Tue, Dec 16, 2014 at 7:11 PM, Christopher ctubb...@apache.org wrote:
I think it'd be good to let somebody else exercise the process a bit, but I
can make the RCs if
I've been running a job in local mode using --master local[*] and I've
noticed that, for some reason, exceptions appear to get eaten- as in, I
don't see them. If I debug in my IDE, I'll see that an exception was thrown
if I step through the code but if I just run the application, it appears
A good example of the count/sum/average can be found in our StatsCombiner
example [1]. Joins are a complicated one- your implementation of joins will
really depend on your data set and the expected sizes of each side of the
join. You can obviously always resort to joining data together on
You're going to want to use WholeRowIterator.decodeRow(entry.getKey(),
entry.getValue()) for that one. You can do:
for (Entry<Key,Value> entry : scanner) {
  for (Entry<Key,Value> actualEntry :
      WholeRowIterator.decodeRow(entry.getKey(), entry.getValue()).entrySet()) {
    // do something with the decoded entry
  }
}
Also talked a little about Christopher's work on a new API design:
https://github.com/ctubbsii/accumulo/blob/ACCUMULO-2589/
On Tue, Dec 9, 2014 at 11:56 PM, Josh Elser josh.el...@gmail.com wrote:
Just so you don't think I forgot, there wasn't really much to report
today. Lots of friendly
I'm looking @ this page: http://hadoop.apache.org/docs/stable/
Is it a typo that Hadoop 2.6.0 is based on 2.4.1?
Thanks.
Reading the documentation a little more closely, I'm using the wrong
terminology. I'm using stages to refer to what spark is calling a job. I
guess application (more than one spark context) is what I'm asking about
On Dec 5, 2014 5:19 PM, Corey Nolet cjno...@gmail.com wrote:
I've read in the documentation that RDDs can be run concurrently when
submitted in separate threads. I'm curious how the scheduler would handle
propagating these down to the tasks.
I have 3 RDDs:
- one RDD which loads some initial data, transforms it and caches it
- two RDDs which use the cached
+1 in case it wasn't inferred from my previous comments. As Josh stated,
I'm still confused how the veto still holds technical justification- the
changes being made aren't removing methods from the public API.
On Mon, Dec 1, 2014 at 3:42 PM, Josh Elser josh.el...@gmail.com wrote:
I still don't
I had a ticket for that a while back and I don't believe it was ever
completed. By default, it wants to dump out new config files for
everything- having it reuse a config file would mean not re-initializing
each time and reusing the same instance id + rfiles.
ACCUMULO-1378 was the ticket and it looks
[ https://issues.apache.org/jira/browse/ACCUMULO-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228916#comment-14228916 ]
Corey Nolet commented on ACCUMULO-3371:
---
David,
http://accumulo.apache.org/1.6
Jeremy,
The PMC boards in ASF are re
On Wed, Nov 26, 2014 at 1:18 PM, Jeremy Kepner kep...@ll.mit.edu wrote:
To be effective, most boards need to be small (~5 people) and not involved
with day-to-day.
Ideally, if someone says "let's bring this to the board for a decision" the
collective
send an email to user-unsubscr...@hadoop.apache.org to unsubscribe.
On Wed, Nov 26, 2014 at 3:08 PM, Li Chen ahli1...@gmail.com wrote:
Please unsubscribe me, too.
Li
On Wed, Nov 26, 2014 at 3:03 PM, Sufi Nawaz s...@eaiti.com wrote:
Please suggest how to unsubscribe from this list.
Thank
I could understand the veto if the change actually caused one of the
issues mentioned above or the issue that Sean is raising. But it does not.
The eventual consistency of property updates was an issue before this
change and continues to be an issue. This JIRA did not attempt to address
the
assigning the object to a
temporary variable.
Matei
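A minimal sketch of that suggestion, assuming the failure is the shell printing the Job's toString while the job is still unsubmitted: build the Job inside a block and return only its Configuration, so the REPL never prints the Job itself.

import org.apache.hadoop.mapreduce.Job

// The block's value is the Configuration, whose toString is harmless;
// the Job never reaches the REPL's printer, so its toString is never called
val hadoopConf = {
  val job = Job.getInstance(sc.hadoopConfiguration)
  // configure the input format on `job` here
  job.getConfiguration
}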
On Nov 5, 2014, at 2:54 PM, Corey Nolet cjno...@gmail.com wrote:
The closer I look @ the stack trace in the Scala shell, it appears to be
the call to toString() that is causing the construction of the Job object
to fail. Is there a way
[ https://issues.apache.org/jira/browse/ACCUMULO-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224975#comment-14224975 ]
Corey Nolet commented on ACCUMULO-1817:
---
Awesome! Given that people have been
I was playing around in the Spark shell and newing up an instance of Job
that I could use to configure the inputformat for a job. By default, the
Scala shell println's the result of every command typed. It throws an
exception when it printlns the newly created instance of Job because it
looks like
)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
On Tue, Nov 25, 2014 at 9:39 PM, Rohith Sharma K S
rohithsharm...@huawei.com wrote:
Could you give error message or stack trace?
*From:* Corey Nolet [mailto:cjno...@gmail.com]
*Sent
I was actually about to post this myself- I have a complex join that could
benefit from something like a GroupComparator vs having to do multiple
groupBy operations. This is probably the wrong thread for a full discussion
on this but I didn't see a JIRA ticket for this or anything similar- any
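For what it's worth, one hedged way to fake a GroupComparator in Spark is to key by (group, sortField), partition on the group alone, and let repartitionAndSortWithinPartitions (present as of Spark 1.2) order each partition; `records` below is an assumed RDD[((String, Int), String)]:

import org.apache.spark.{HashPartitioner, Partitioner}
import org.apache.spark.SparkContext._

// Partition on the group half of the key only, so a group's records stay
// together while the full (group, sortField) key drives the sort
class GroupPartitioner(partitions: Int) extends Partitioner {
  private val hash = new HashPartitioner(partitions)
  def numPartitions: Int = partitions
  def getPartition(key: Any): Int = key match {
    case (group, _) => hash.getPartition(group)
  }
}

val sorted = records.repartitionAndSortWithinPartitions(new GroupPartitioner(8))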
Abdul,
Please send an email to user-unsubscr...@spark.apache.org
On Tue, Nov 18, 2014 at 2:05 PM, Abdul Hakeem alhak...@gmail.com wrote:
, Corey Nolet cjno...@gmail.com wrote:
Josh,
My worry with a contrib module is that, historically, code which
moves to a contrib is just one step away from the grave.
You do have a good point. My hope was that this could be the beginning
of
our changing history so that we
I noticed Spark 1.2.0-SNAPSHOT still has 2.4.x in the pom. Since 2.5.x is
the current stable Hadoop 2.x, would it make sense for us to update the
poms?
specialization needed beyond that. The profile sets hadoop.version to
2.4.0 by default, but this can be overridden.
On Fri, Nov 14, 2014 at 3:43 PM, Corey Nolet cjno...@gmail.com wrote:
I noticed Spark 1.2.0-SNAPSHOT still has 2.4.x in the pom. Since 2.5.x is
the current stable Hadoop 2.x
+1 for adding the examples to contrib.
I was, myself, reading over this email wondering how a set of 11 separate
examples on the use of Accumulo would fit into the core codebase-
especially as more are contributed over time. I like the idea of giving
community members an outlet for contributing
the
community which has been stagnant with respect to new committers for about
9 months now.
Corey Nolet wrote:
+1 for adding the examples to contrib.
I was, myself, reading over this email wondering how a set of 11 separate
examples on the use of Accumulo would fit into the core codebase
I'm loading sequence files containing json blobs in the value, transforming
them into RDD[String] and then using hiveContext.jsonRDD(). It looks like
Spark reads the files twice- once when I define the jsonRDD() and then
again when I actually make my call to hiveContext.sql().
Looking @ the
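A hedged sketch of one workaround, assuming the second pass comes from jsonRDD's schema inference: cache the RDD[String] so inference and the query share a single read (paths and value type are made up):

// Cache the strings so schema inference and the actual query share one scan
val jsonStrings = sc.sequenceFile[String, String]("/data/blobs")
  .map { case (_, blob) => blob }
  .cache()

val table = hiveContext.jsonRDD(jsonStrings)
table.registerTempTable("blobs")
hiveContext.sql("SELECT count(*) FROM blobs").collect()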
+1 (non-binding) [for original process proposal]
Greg, the first time I've seen the word "ownership" on this thread is in
your message. The first time the word "lead" has appeared in this thread is
in your message as well. I don't think that was the intent. The PMC and
Committers have a
PMC [1] is responsible for oversight and does not designate partial or full
committer. There are projects where all committers become PMC and others
where PMC is reserved for committers with the most merit (and willingness
to take on the responsibility of project oversight, releases, etc...).
I'm actually going to change my non-binding to +0 for the proposal as-is.
I overlooked some parts of the original proposal that, when reading over
them again, do not sit well with me. "One of the maintainers needs to sign
off on each patch to the component", as Greg has pointed out, does seem to
place that there is a problem
is 'ln.streetnumber', which prevents the rest of the query from resolving.
If you look at the subquery ln, it is only producing two columns:
locationName and locationNumber. So streetnumber is not valid.
On Tue, Oct 28, 2014 at 8:02 PM, Corey Nolet cjno...@gmail.com
I'm trying to use a custom input format with SparkContext.newAPIHadoopRDD.
Creating the new RDD works fine but setting up the configuration file via
the static methods on input formats that require a Hadoop Job object is
proving to be difficult.
Trying to new up my own Job object with the
, Corey Nolet cjno...@gmail.com wrote:
I'm trying to use a custom input format with SparkContext.newAPIHadoopRDD.
Creating the new RDD works fine but setting up the configuration file via
the static methods on input formats that require a Hadoop Job object is
proving to be difficult.
Trying
Michael,
I should probably look closer myself @ the design of 1.2 vs 1.1 but I've
been curious why Spark's in-memory data uses the heap instead of putting it
off heap? Was this the optimization that was done in 1.2 to alleviate GC?
On Mon, Nov 3, 2014 at 8:52 PM, Shailesh Birari
I'm fairly new to spark and I'm trying to kick the tires with a few
InputFormats. I noticed the sc.hadoopRDD() method takes a mapred JobConf
instead of a MapReduce Job object. Is there future planned support for the
mapreduce packaging?
Hongbin,
Please send an email to user-unsubscr...@spark.apache.org in order to
unsubscribe from the user list.
On Fri, Oct 31, 2014 at 9:05 AM, Hongbin Liu hongbin@theice.com wrote:
Apology for having to send to all.
I am highly interested in spark, would like to stay in this mailing
at 2:19 PM, Corey Nolet cjno...@gmail.com wrote:
Is it possible to select if, say, there was an addresses field that had a
json array?
You can get the Nth item by address.getItem(0). If you want to walk
through the whole array look at LATERAL VIEW EXPLODE in HiveQL
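Something like this, as a sketch against the field names assumed from this thread (a people table with name and a locations array):

hiveContext.sql("""
  SELECT name, loc.number
  FROM people
  LATERAL VIEW explode(locations) l AS loc
""").collect()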
).collect()
res0: Array[org.apache.spark.sql.Row] = Array([John])
This will double show people who have more than one matching address.
On Tue, Oct 28, 2014 at 5:52 PM, Corey Nolet cjno...@gmail.com wrote:
So it wouldn't be possible to have a json string like this:
{"name":"John", "age":53, "locations"
Am I able to do a join on an exploded field?
Like if I have another object:
{"streetNumber":2300, "locationName":"The Big Building"} and I want to
join with the previous json by the locations[].number field- is that
possible?
On Tue, Oct 28, 2014 at 9:31 PM, Corey Nolet cjno...@gmail.com wrote
$QueryExecution.sparkPlan(SQLContext.scala:400)
On Tue, Oct 28, 2014 at 10:48 PM, Michael Armbrust mich...@databricks.com
wrote:
Can you println the .queryExecution of the SchemaRDD?
On Tue, Oct 28, 2014 at 7:43 PM, Corey Nolet cjno...@gmail.com wrote:
So this appears to work just fine:
hctx.sql("SELECT
Dylan,
I know your original post mentioned grabbing it through the client API but
there's not currently a way to do that. As Sean mentioned, you can do it if
you have access to the cluster. You can run the reflection Keith provided
by adding the files in $ACCUMULO_HOME/lib/ to your classpath and
A concrete plan and a definite version upon which the upgrade would be
applied sounds like it would benefit the community. If you plan far enough
out (as Hadoop has done) and give the community enough of a notice, I can't
see it being a problem as they would have ample time to upgrade.
On Sat, Oct
I started a project to do sliding and tumbling windows in Storm. It could
be used directly or as an example.
http://github.com/calrissian/flowmix
On Oct 9, 2014 11:54 PM, 姚驰 yaoch...@163.com wrote:
Hello, I'm trying to use storm to manipulate our monitoring data, but I
don't know how to add a
The Fluo project is happy to announce the 1.0.0-alpha-1 release of Fluo.
Fluo is a transaction layer that enables incremental processing on top of
Accumulo. It integrates into Yarn using Apache Twill.
This is the first release of Fluo and is not ready for production use. We
invite developers to
I'm all for this- though I'm curious to know the thoughts about maintenance
and the design. Are we going to use thrift to tie the C++ client calls into
the server-side components? Is that going to be maintained through a
separate effort or is the plan to have the Accumulo community officially
The Apache Accumulo project is happy to announce its 1.6.1 release.
Version 1.6.1 is the most recent bug-fix release in its 1.6.x release line.
This version includes numerous bug fixes and performance improvements over
previous versions. Existing users of 1.6.x are encouraged to upgrade to
this
I think a logo that's more friendly to place in a circle would be useful.
The Accumulo logo is very squared off.
On Thu, Oct 2, 2014 at 3:39 PM, Mike Drob mad...@cloudera.com wrote:
Yea, as an outside observer, I would have no idea what Apache A is, nor
any idea how to get more information.
wonder if it's a JVM
thing?)
On Wed, Sep 24, 2014 at 9:06 PM, Corey Nolet cjno...@gmail.com wrote:
Vote passes with 4 +1's and no -1's.
Bill, were you able to get the IT to run yet? I'm still having timeouts
on
my end as well.
On Wed, Sep 24, 2014 at 1:41 PM, Josh Elser josh.el
technically
anybody could do this, and merge it (along with the version bump to
1.6.2-SNAPSHOT commit) to 1.6.2-SNAPSHOT branch (and forward, with -sours),
if Corey doesn't have time/gets busy.
--
Christopher L Tubbs II
http://gravatar.com/ctubbsii
On Thu, Sep 25, 2014 at 2:21 PM, Corey
directories for the test
and the failsafe output.
It doesn't fail for me. It's possible that there is some edge case that
you and Bill are hitting that I'm not.
Corey Nolet wrote:
I'm seeing the behavior under Mac OS X and Fedora 19 and they have
been
consistently
Bill,
I've been having that same IT issue and said the same thing: it's not
happening to others. I lifted the timeout completely and it never finished.
On Wed, Sep 24, 2014 at 1:13 PM, Mike Drob mad...@cloudera.com wrote:
Any chance the IRC chats can make it onto the ML for posterity?
Mike
+1
Using separate branches in this manner just adds complexity. I was
wondering myself why we needed to create separate branches when all we're
doing is tagging/deleting the already released ones. The only difference
between where one leaves off and another begins is the name of the branch.
On
Congrats!
On Mon, Sep 22, 2014 at 5:16 PM, P. Taylor Goetz ptgo...@gmail.com wrote:
I’m pleased to announce that Apache Storm has graduated to a Top-Level
Project (TLP), and I’d like to thank everyone in the Storm community for
your contributions and help in achieving this important
If we are concerned with confusion about adoption of new versions, we
should make a point to articulate the purpose very clearly in each of the
announcements. I was in the combined camp an hour ago and now I'm also
thinking we should keep them separate.
On Fri, Sep 19, 2014 at 1:16 AM, Josh
, 2014 at 6:50 PM, Corey Nolet-2 [via Apache Accumulo] [hidden email] wrote:
Awesome John! It's good to have this documented for future users. Keep us
updated!
On Sun, Aug 24, 2014 at 11:05 AM, JavaHokie [hidden email]
A while ago I had written a camel adapter for Storm so that spout inputs
could come from camel. Not sure how useful it would be for you but it's
located here:
https://github.com/calrissian/storm-recipes/blob/master/camel/src/main/java/org/calrissian/recipes/camel/spout/CamelConsumerSpout.java
Hi
Also, Trident is a DSL for rapidly producing useful analytics in Storm and
I've been working on a DSL that makes stream processing for complex event
processing possible.
That one is located here:
https://github.com/calrissian/flowmix
On Sep 16, 2014 4:29 AM, dominique.vill...@orange.com wrote:
:)
On 9/10/14, 10:43 AM, Corey Nolet wrote:
I had posted this to the mailing list originally after a discussion
with
Christopher at the Accumulo Summit hack-a-thon and because I wanted to
get
into the release process to help out.
Josh, I still wouldn't mind getting together 1.6.1
in further.
On Fri, Aug 22, 2014 at 11:41 PM, Corey Nolet cjno...@gmail.com wrote:
Josh,
Your advice is definitely useful- I also thought about catching the
exception and retrying with a fresh batch writer but the fact that the
batch writer failure doesn't go away without being re-instantiated
I'm thinking this could be a yarn.application.classpath configuration
problem in your yarn-site.xml. I meant to ask earlier- how are you building
your jar that gets deployed? Are you shading it? Using libjars?
On Sun, Aug 24, 2014 at 6:56 AM, JavaHokie soozandjohny...@gmail.com
wrote:
Hey
Awesome John! It's good to have this documented for future users. Keep us
updated!
On Sun, Aug 24, 2014 at 11:05 AM, JavaHokie soozandjohny...@gmail.com
wrote:
Hi Corey,
Just to wrap things up, AccumuloMultiTableInputFormat is working really
well. This is an outstanding feature I can
Awesome! I was going to recommend checking out the code last night so that
you could put some logging statements in there. You've probably noticed
this already but the MapWritable does not have static type parameters so it
dumps out the fully qualified class name so that it can instantiate it back
).
https://issues.apache.org/jira/browse/ACCUMULO-2990
On 8/22/14, 4:35 PM, Corey Nolet wrote:
Eric & Keith, Chris mentioned to me that you guys have seen this issue
before. Any ideas from anyone else are much appreciated as well.
I recently updated a project's dependencies to Accumulo 1.6.0
is that all mutations added before the last flush()
happened are durable on the server. Anything else is a guess. I don't know
the specifics, but that should be enough to work with (and saving off
mutations shouldn't be too costly since they're stored serialized).
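A minimal sketch of that contract (assumes an existing, authenticated `connector` and a writable table):

import org.apache.accumulo.core.client.BatchWriterConfig
import org.apache.accumulo.core.data.{Mutation, Value}
import org.apache.hadoop.io.Text

val writer = connector.createBatchWriter("table", new BatchWriterConfig)

val m = new Mutation("row1")
m.put(new Text("cf"), new Text("cq"), new Value("value".getBytes))
writer.addMutation(m)

writer.flush()   // every mutation added above is now durable on the servers
writer.close()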
On 8/22/14, 5:44 PM, Corey Nolet
Hey John,
Could you give an example of one of the ranges you are using which causes
this to happen?
On Fri, Aug 22, 2014 at 11:02 PM, John Yost soozandjohny...@gmail.com
wrote:
Hey Everyone,
The AccumuloMultiTableInputFormat is an awesome addition to the Accumulo
API and I am really
The table configs get serialized as base64 and placed in the job's
Configuration under the key AccumuloInputFormat.ScanOpts.TableConfigs.
Could you verify/print what's being placed in this key in your
configuration?
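A one-line sanity check, assuming `job` is the mapreduce Job being configured:

// A null here means the table configs never made it into the serialized configuration
println(job.getConfiguration.get("AccumuloInputFormat.ScanOpts.TableConfigs"))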
On Sat, Aug 23, 2014 at 12:15 AM, JavaHokie soozandjohny...@gmail.com
wrote:
The tests I'm running aren't using the native Hadoop libs either. If you
don't mind, a little more code as to how you are setting up your job would
be useful. That's weird the key in the config would be null. Are you using
the job.getConfiguration()?
On Sat, Aug 23, 2014 at 12:31 AM, JavaHokie
at 1:11 AM, Corey Nolet cjno...@gmail.com wrote:
Job.getInstance(configuration) copies the configuration and makes its own.
Try doing your debug statement from earlier on job.getConfiguration() and
let's see what the base64 string looks like.
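A small sketch of that copy semantic, with a hypothetical key:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job

val conf = new Configuration()
val job = Job.getInstance(conf)     // snapshots conf into the job's own copy
conf.set("example.key", "value")    // lands only in the original conf
println(job.getConfiguration.get("example.key"))   // prints null: the copy predates the set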
On Sat, Aug 23, 2014 at 1:00 AM, JavaHokie
That code I posted should be able to validate where you are getting hung
up. Can you try running that on the machine and seeing if it prints the
expected tables/ranges?
Also, are you running the job live? What does the configuration look like
for the job on your resource manager? Can you see if
Kafka is also distributed in nature, which is not something easily achieved
by queuing brokers like ActiveMQ or JMS (1.0) in general. Kafka allows data
to be partitioned across many machines which can grow as necessary as your
data grows.
On Thu, Aug 14, 2014 at 11:20 PM, Justin Workman
it handles IPv4/6.
Try adding the following JVM parameter when running your tests:
-Djava.net.preferIPv4Stack=true
-Taylor
On Aug 4, 2014, at 8:49 PM, Corey Nolet cjno...@gmail.com wrote:
I'm testing some sliding window algorithms with tuples emitted from a
mock spout based on a timer
Sorry- the ipv4 fix worked.
On Tue, Aug 5, 2014 at 9:13 PM, Corey Nolet cjno...@gmail.com wrote:
This did work. Thanks!
On Tue, Aug 5, 2014 at 2:23 PM, P. Taylor Goetz ptgo...@gmail.com wrote:
My guess is that the slowdown you are seeing is a result of the new
version of ZooKeeper
-Djava.net.preferIPv4Stack=true
-Taylor
On Aug 4, 2014, at 8:49 PM, Corey Nolet cjno...@gmail.com wrote:
I'm testing some sliding window algorithms with tuples emitted from a
mock spout based on a timer but the amount of time it takes the topology to
fully start up and activate seems to vary from
);
completeTopologyParam.setStormConf(daemonConf);
completeTopologyParam.setTopologyName(getTopologyName());
Map result = Testing.completeTopology(cluster, topology, completeTopologyParam);
});
-Vincent
On Mon, Aug 4, 2014 at 8:49 PM, Corey Nolet cjno...@gmail.com wrote:
I'm testing some sliding window algorithms