.
Thank you,
--Matt
release manager, hadoop-1
--
Chris K Wensel
ch...@concurrentinc.com
http://concurrentinc.com
downloaded 1.2.1. When 1.3.0 is
produced, since it will be beta quality, 1.2.1 will of course remain as the
stable version.
--Matt
On Mon, Aug 5, 2013 at 9:43 AM, Chris K Wensel ch...@wensel.net wrote:
any particular reason the 1.1.2 releases were pulled from the mirrors (so
quickly
MapReduce
Nutch
Oozie
Pig
Sqoop
Zookeeper
Your help is highly appreciated.
Thanks,
Iyad
--
Chris K Wensel
ch...@concurrentinc.com
http://concurrentinc.com
the problem domain.
and you can test your processing app independently of making it work in staging
or production just by swapping out taps.
http://www.cascading.org/
btw, I use IntelliJ for all my development.
cheers,
chris
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
to rewrite working apps.
don't hesitate to ask questions on the list or IRC channel (#cascading)
chris
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support for Cascading
: command not found
I am not even sure if I edited the hadoop-ec2-env.sh correctly. Is there any
newer tutorial for setting this up?
Thanks!
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, and support for Cascading
group and new application
comes along.
Thanks for your help,
Chris
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
to Hadoop with
Cascading as the query planner and processing engine. Some of which will ship
as products this year.
fyi, there will be a Cascalog workshop this Saturday (I'll be attending)
http://www.cascading.org/2011/02/cascalog-workshop-february-19t.html
cheers,
chris
--
Chris K Wensel
ch
.
Thanks
Sarthak Dudhara
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
J
www.harshj.com
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
data to/from RDBMSs into Hadoop
• Cascalog - a robust interactive extensible query language
cheers,
chris
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
@hadoop.apache.org
Subject: Re: Cascading 1.2 Released
Congrats, Chris! How are you doing?
I am getting remote Hadoop contracts in California and in NY, which feels
pretty cool.
Mark
On Wed, Dec 1, 2010 at 4:42 PM, Chris K Wensel ch...@wensel.net wrote:
We are happy to announce
the data would be too much. I
tried searching around for any tools that might help orchestrate something
like this, but did not find anything. Are there any tools I'm missing that I
should look into?
Thanks
Jason
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
://groups.google.com/group/cascading-user?hl=en.
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
/cascading-user?hl=en.
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
-- The Fringes of Scalability, Social Media,
and Computer Science
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
like a single process even though internally there is an unknown number
of Flows being created on the fly. (I'm running a connected component algorithm
that requires multiple Flows/passes in production now as a Riffle object)
Please feel free to fork and tweak.
ckw
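A minimal sketch of the "unknown number of passes" idea, in plain Python rather than Cascading or Riffle. It is not the author's code: each call to the hypothetical propagate_once stands in for one Flow, and the driver keeps launching passes until the labels stop changing, so the caller sees a single process.

```python
def propagate_once(labels, edges):
    """One pass: every edge pulls both endpoints down to the smaller label."""
    updated = dict(labels)
    for a, b in edges:
        smallest = min(updated[a], updated[b])
        updated[a] = smallest
        updated[b] = smallest
    return updated


def connected_components(nodes, edges):
    """Run passes until nothing changes; return (labels, passes_run)."""
    labels = {n: n for n in nodes}  # each node starts in its own component
    passes = 0
    while True:
        passes += 1
        new_labels = propagate_once(labels, edges)
        if new_labels == labels:  # fixed point reached, no further pass needed
            return labels, passes
        labels = new_labels
```

The number of passes depends on the shape of the graph, which is why it cannot be fixed ahead of time.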
--
Chris K Wensel
ch
.
Huh?
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
,
Thanks
bharath.v
ug3
IIIT Hyderabad!
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
Hi all,
RapLeaf is hosting a Cascading meetup on September 24th. More details
at:
http://blog.rapleaf.com/dev/?p=196
and
http://upcoming.yahoo.com/event/4421260
Hope to see you there!
chris
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
Hi all,
A quick reminder that Scale Unlimited will run a 2-day Hadoop BootCamp
in Berlin on August 27th and 28th.
This 2-day course is for managers and developers who want to quickly
become experienced with Hadoop and related technologies.
The BootCamp provides training in MapReduce
one dataset, which of course will be
problematic (and won't scale) when dealing with large datasets with
large numbers of records with the same keys.
Does an efficient algorithm exist for a many-to-many reduce-side
join?
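A small sketch of the buffering approach, assuming the truncated question refers to a reducer that holds one dataset's records for a key in memory while streaming the other. It is plain Python, not Hadoop code; the LEFT/RIGHT tags and the function name are hypothetical, and the sort stands in for the shuffle.

```python
from itertools import groupby
from operator import itemgetter

LEFT, RIGHT = 0, 1  # hypothetical source tags; LEFT sorts ahead of RIGHT


def reduce_side_join(left_records, right_records):
    """Yield (key, left_value, right_value) for every matching pair."""
    tagged = [(k, LEFT, v) for k, v in left_records]
    tagged += [(k, RIGHT, v) for k, v in right_records]
    # Stand-in for the shuffle: order by key, then by source tag.
    tagged.sort(key=itemgetter(0, 1))
    for key, group in groupby(tagged, key=itemgetter(0)):
        buffered_left = []
        for _, tag, value in group:
            if tag == LEFT:
                # Memory use grows with the number of LEFT values per key.
                buffered_left.append(value)
            else:
                # RIGHT values are streamed against the buffered LEFT side.
                for left_value in buffered_left:
                    yield key, left_value, value
```

Only one side is buffered, so tagging the side with fewer records per key as LEFT keeps the buffer small, but a key that is heavy on both sides still runs into the scaling problem raised above.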
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
...@inf.ed.ac.uk
wrote:
... and only in the US
Miles
2009/4/2 zhang jianfeng zjf...@gmail.com:
Does it support pig ?
On Thu, Apr 2, 2009 at 3:47 PM, Chris K Wensel ch...@wensel.net
wrote:
FYI
Amazon's new Hadoop offering:
http://aws.amazon.com/elasticmapreduce/
And Cascading 1.0
, it could be a job-setup/job-cleanup
task that is running on a reduce slot. See HADOOP-3150 and
HADOOP-4261.
-Amareshwari
Chris K Wensel wrote:
May have found the answer, waiting on confirmation from users.
Turns out 0.19.0 and 0.19.1 instantiate the reducer class when the task
is actually intended
the issue seems to manifest with or without spec exec.
ckw
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
Hey all,
Sohrab Modi and I caught the elusive St.Ack on video. Enjoy...
HBase Interview Part 1:
http://www.youtube.com/watch?v=1SLzrb2N4vI
HBase Interview Part 2:
http://www.youtube.com/watch?v=-VXRe1X9Xss
HBase Discussion:
http://www.youtube.com/watch?v=7jTYs7r2cPM
ckw
--
Chris K Wensel
for such a backup cluster.
I understand that the copy can be implemented with MR
but for now we can implement it just as a simple sequential script,
which scans the tables of the production HBase and writes the data
to the backup HBase.
Does it make sense?
Thank you for your cooperation,
M.
--
Chris K
to run a few HBase clients in a single JVM?
On Wed, Feb 4, 2009 at 6:28 PM, Chris K Wensel ch...@wensel.net
wrote:
Hey Michael
You could probably use Cascading to migrate data between HBase
clusters.
http://wiki.apache.org/hadoop/Hbase/Cascading
But the code currently doesn't support
classes with different class
loaders?
M.
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
natural to develop and think in than
MapReduce.
http://www.cascading.org/
enjoy,
chris
p.s. If you have any code you want to contribute back, just stick it
on GitHub and send me a link.
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
the list before filing an issue because it seems like
someone may have thought about this in the past.
Thanks.
Jonathan Gray
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
that result from a referral, if any.
To be added to our referral list or if you have a project that might
benefit from Hadoop or related technologies, please email me directly.
This course will also be announced for open public enrollment in the
coming days.
cheers,
chris
--
Chris K Wensel
ch
this there
were before
I put together a patch. Seems like bad Java practice to depend on shell
utilities :-). Not very platform agnostic...
Dan
--
Dan Diephouse
http://netzooid.com/blog
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
Concurrent, Inc.
http://www.concurrentinc.com/
And finally, Advanced Hadoop and Cascading training (and consulting)
is available through Scale Unlimited:
http://www.scaleunlimited.com/
cheers,
chris
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
it to draw
charts based on time series with fairly low latency?
Thanks!
Brock
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
cascading a shot for what I am doing.
Cheers
Tim
On Tue, Nov 25, 2008 at 9:24 PM, Chris K Wensel [EMAIL PROTECTED]
wrote:
Hey Tim
The .configure() method is what you are looking for, I believe.
It is called once per task, which in the default case is once per
JVM.
Note Jobs are broken
cluster doing the same kind of job on different data.
Karl Anderson
[EMAIL PROTECTED]
http://monkey.org/~kra
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
23, 2008, at 7:47 AM, Stuart Sierra wrote:
Hi folks,
Anybody tried scripting Hadoop on EC2 to...
1. Launch a cluster
2. Pull data from S3
3. Run a job
4. Copy results to S3
5. Terminate the cluster
... without any user interaction?
-Stuart
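One way to structure the five steps above so that they run without user interaction, as a sketch only. The function and its callables are hypothetical placeholders: real versions would wrap whatever EC2, S3, and Hadoop commands are actually in use. The point of the shape is that the cluster is terminated even when a step fails.

```python
def run_unattended(launch, steps, terminate):
    """Launch a cluster, run each step in order, and always terminate it."""
    cluster = launch()  # step 1: launch a cluster
    try:
        for step in steps:  # steps 2-4: pull data, run the job, copy results
            step(cluster)
    finally:
        terminate(cluster)  # step 5: runs even if a step raised
```

Any exception from a step still propagates to the caller after the cluster has been shut down, so a wrapper script can report the failure.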
--
Chris K Wensel
[EMAIL PROTECTED]
http
,
again, give me a shout.
cheers,
chris
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
Chris K Wensel wrote:
doh, conveniently collides with the GridGain and GridDynamics
presentations:
http://web.meetup.com/66/calendar/8561664/
Bay Area Hadoop User Group meetings are held on the third Wednesday
every month. This has been on the calendar for quite a while.
Doug
maybe I
Quick reminder to take the survey. We know more than a dozen companies
are using Hadoop. heh
http://www.scaleunlimited.com/survey.html
thanks!
chris
On Sep 8, 2008, at 10:43 AM, Chris K Wensel wrote:
Hey all
Scale Unlimited is putting together some case studies for an
upcoming
)?' It would not let me enter more than 10TB (we currently have
45TB of
data in our cluster; actual data, not a sum of disk used (with all
of its
replicas) but unique data).
Other than that, I tried :-)
On Tue, Sep 9, 2008 at 4:01 PM, Chris K Wensel [EMAIL PROTECTED]
wrote:
Quick
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
results will be public.
cheers,
chris
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
College, Santa
Clara, CA, Building 2, Training Rooms 34.
Agenda:
Cloud Computing Testbed - Thomas Sandholm, HP
Katta on Hadoop - Stefan Groschupf
Registration and directions: http://upcoming.yahoo.com/event/1075456/
Look forward to seeing you there!
Ajay
--
Chris K Wensel
[EMAIL
Hey all
Has anyone had success with RandomTextWriter?
I'm finding it fairly unstable on 0.16.x, haven't tried 0.17 yet though.
chris
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
, at 10:08 AM, Arun C Murthy wrote:
On Jul 7, 2008, at 9:46 AM, Chris K Wensel wrote:
Hey all
Has anyone had success with RandomTextWriter?
I'm finding it fairly unstable on 0.16.x, haven't tried 0.17 yet
though.
What problems are you seeing? It seems to work fine for me...
Arun
--
Chris
Townsend St., Third Floor
San Francisco, CA 94107
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
a custom
AMI with
a modified hadoop-init script right? or am I completely confused?
slitz
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
How do I put something into the fs?
Something like bin/hadoop fs -put input input will not work well since s3
is not the default fs, so I tried bin/hadoop fs -put input
s3://ID:[EMAIL PROTECTED]/input (and some variations of it), but it didn't
work; I always got an error complaining
to
failure against work that must get done, regardless of the amount of
work.
ckw
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
from the map outputs stored on HDFS.
Is there a way to make the mappers store the final output in HDFS?
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
--
hustlin, hustlin, everyday I'm hustlin
--
Chris K Wensel
similar to pig, do you care to provide your
comment
here? If map reduce programmers are to go to the next level
(scripting/query
language), which way to go?
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
map/reduce jobs, all inter-related, from ~10 unique units of
work (internally lots of joins, sorts and math). I can't imagine
having written them by hand.
ckw
--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
be tuned to your application (cpu or io
bound).
ckw
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
namenode
localhost: no datanode to stop
localhost: no secondarynamenode to stop
conf files in /usr/local/hadoop-0.17.0
==
# cat conf/slaves
localhost
# cat conf/masters
localhost
--
Chris Anderson
http://jchris.mfdz.com
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http
of authentication it
would still be plain http.
2.) Some kind of tunneling solution. The problem on this side is that
each of my cluster nodes is in a different subnet, plus the dualism
between the internal and external addresses of the nodes.
Any hints? TIA,
Andreas
Chris K Wensel
[EMAIL PROTECTED]
http
need any AWS keys etc.
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
cheaply.
btw, the email notifying you that you have been approved may lag the
actual approval (mine did for days). might be worth trying a larger
cluster to see.
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
, but
there are usually a couple at the end that take longer than I think
they should, and they frequently have these sorts of errors.
I'm running 20 machines on ec2 right now, with hadoop version 0.16.4.
--
James Moore | [EMAIL PROTECTED]
blog.restphone.com
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net
://wiki.apache.org/hadoop/AmazonEC2
-- Jim R. Wilson (jimbojw)
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
. Wilson wrote:
Thanks Chris,
Where do I get this supposed image/create-hadoop-remote script? I
couldn't `find` it anywhere within the hadoop svn tree, and the link
in the hadoop wiki is broken :/
-- Jim
On Wed, May 7, 2008 at 2:04 PM, Chris K Wensel [EMAIL PROTECTED]
wrote:
You don't need 0.17
. Wilson
[EMAIL PROTECTED] wrote:
Thanks Chris,
Where do I get this supposed image/create-hadoop-remote script? I
couldn't `find` it anywhere within the hadoop svn tree, and the link
in the hadoop wiki is broken :/
-- Jim
On Wed, May 7, 2008 at 2:04 PM, Chris K Wensel [EMAIL PROTECTED]
wrote
, then post it somewhere
(like s3) and have the script access that?
-- Jim
On Wed, May 7, 2008 at 2:27 PM, Chris K Wensel [EMAIL PROTECTED]
wrote:
you do need the whole ec2 tree for the scripts to work...
On May 7, 2008, at 12:25 PM, Jim R. Wilson wrote:
Nevermind, looks like I needed
cheers,
ckw
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
No. It just means I have no idea how appends will be implemented and
how it affects the other FileSystems.
On May 1, 2008, at 2:59 AM, Leon Mergen wrote:
On Thu, May 1, 2008 at 2:30 AM, Chris K Wensel [EMAIL PROTECTED]
wrote:
Further, once support for appends is added to Hadoop/HDFS, I
-compressed SequenceFile, with the file names as keys. Will that
work?
Thanks,
-Stuart, altlaw.org
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
,
-stephen
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
bin/
hadoop.
From here I do not know how to proceed.
I basically want to implement
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873
.
Hence I created a host using dyndns.
If you can help me, it would be great.
On Tue, Apr 15, 2008 at 11:30 PM, Chris K Wensel [EMAIL
is for the ganglia interface.
On Apr 11, 2008, at 2:01 PM, Nate Carlson wrote:
On Wed, 9 Apr 2008, Chris K Wensel wrote:
make sure all nodes are running in the same 'availability zone',
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1347
check!
and that you are using the new xen
|
| depriving some poor village of its idiot since
1981|
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
, Chris K Wensel wrote:
Hey all
I pushed up a patch (and tar) for the ec2 contrib scripts that
provides support for instance sizes, new Xen kernels, availability zones,
concurrent clusters, resizing, ganglia, etc.
the patch can be found here:
https://issues.apache.org/jira/browse/HADOOP-2410
I
FYI, Just ran a 50 node cluster using one of the new kernels for
Fedora with all nodes forced onto the same 'availability zone' and
there were no timeouts or failed writes.
On Mar 27, 2008, at 4:16 PM, Chris K Wensel wrote:
If it's any consolation, I'm seeing similar behaviors on 0.16.0 when
AM, Prasan Ary wrote:
Chris,
What do you mean when you say boot the slaves with the master
private name ?
===
Chris K Wensel [EMAIL PROTECTED] wrote:
I found it much better to start the master first, then boot the
slaves
with the master private name.
i do not use
)
at
org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1191)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
On Mar 13, 2008, at 4:59 PM, Chris K Wensel wrote:
I don't really have these logs as I've bounced my cluster. But am
willing to ferret out anything in particular on my next
$BlockReceiver.init(DataNode.java:
1983)
at org.apache.hadoop.dfs.DataNode
$DataXceiver.writeBlock(DataNode.java:1074)
at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
at java.lang.Thread.run(Thread.java:619)
On Mar 13, 2008, at 11:25 AM, Chris K Wensel wrote
.
Raghu.
Chris K Wensel wrote:
here is a reset, followed by three attempts to write the block.
2008-03-13 13:40:06,892 INFO org.apache.hadoop.dfs.DataNode:
Receiving block blk_7813471133156061911 src: /10.251.26.3:35762
dest: /10.251.26.3:50010
2008-03-13 13:40:06,957 INFO
the same group, the connectivity
seems to be limited.
3.) All AWS docs tell me that VMs in one group have no firewalls in
place.
So what is happening here? Any ideas?
Andreas
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
counter + counterName + +
group.getCounter(counterName) );
}
}
randomizeRecord.update();
}
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
are not being flushed when
the context is shut down, and the flush methods are not implemented
for the ganglia context.
Chris K Wensel wrote:
I have ganglia up on my cluster, and I definitely see some metrics
from the map/reduce tasks. But I don't see anything from the JVM
context for ganglia
never mind on the jvm. just found the typo.. frown
On Mar 6, 2008, at 3:19 PM, Chris K Wensel wrote:
actually, I don't see any jvm metrics across the cluster.
any idea how to get a local gmond to gmetad to report local
statistics? it is also accumulating slave stats just fine (minus jvm