Re: Trying to get kafka data to Hadoop

2015-03-04 Thread Joel Koshy
I think the camus mailing list would be more suitable for this
question.

Thanks,

Joel

On Wed, Mar 04, 2015 at 11:00:51AM -0500, max square wrote:
 Hi all,
 
 I have browsed through different conversations around Camus, and bring this
 up as something of a Kafka question. I know it's not the most orthodox, but if
 someone has some thoughts I'd appreciate it.
 
 That said, I am trying to set up Camus against a 3-node Kafka 0.8.2.1 cluster,
 using the Avro Schema-Repo project
 (https://github.com/schema-repo/schema-repo). All of the Avro schemas for
 the topics are published correctly. I am building Camus and using:
 
 hadoop jar camus-example-0.1.0-SNAPSHOT-shaded.jar com.linkedin.camus.etl.kafka.CamusJob \
   -libjars $CAMUS_LIBJARS -D mapreduce.job.user.classpath.first=true -P config.properties
 
 as the command to start the job, where CAMUS_LIBJARS is an environment
 variable that holds all the libjars that the mvn package command generates.
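
For illustration, since -libjars expects a comma-separated list, a variable like that could be assembled by comma-joining the jar paths that mvn package drops into the build directory. This is a sketch, not the poster's actual setup; the jar names below are invented:

```shell
# Hypothetical sketch: join the jar paths produced by `mvn package` into
# the comma-separated form that -libjars expects (file names invented).
jars="camus-api.jar
camus-coders.jar
camus-schema-registry.jar"
printf '%s' "$jars" | paste -sd, -
# → camus-api.jar,camus-coders.jar,camus-schema-registry.jar
```

Exporting the result as CAMUS_LIBJARS before invoking hadoop jar would match the command above.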
 
 I have also set the following properties to configure the job:
 
 camus.message.decoder.class=com.linkedin.camus.etl.kafka.coders.LatestSchemaKafkaAvroMessageDecoder
 kafka.message.coder.schema.registry.class=com.linkedin.camus.schemaregistry.AvroRestSchemaRegistry
 etl.schema.registry.url=http://10.0.14.25:2876/schema-repo/
 
 When I execute the job I get an exception indicating that the
 AvroRestSchemaRegistry class can't be found (I've double-checked that it's part
 of the libjars). I wanted to ask if this is the correct way to set up this
 integration, and if anyone has pointers on why the job is not finding the
 class AvroRestSchemaRegistry.
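
One way to chase a ClassNotFoundException like this (an editor's suggestion, not something from the thread) is to translate the class name into the resource path the JVM actually looks up, then grep for that entry in the `jar tf` listing of each jar shipped with the job:

```shell
# A JVM resolves a class by searching the classpath for the class name with
# dots replaced by slashes plus a .class suffix; print that entry so it can
# be grepped for in `jar tf <jar>` output.
echo com.linkedin.camus.schemaregistry.AvroRestSchemaRegistry \
  | tr '.' '/' | sed 's#$#.class#'
# → com/linkedin/camus/schemaregistry/AvroRestSchemaRegistry.class
```

If the entry is present in one of the libjars but the job still fails, the usual suspects are the jars not actually reaching the job's distributed cache, or the classpath-first setting shadowing them.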
 
 Thanks in advance for the help!
 
 Max
 
 The complete stack trace follows:
 
 [CamusJob] - failed to create decoder
 
 com.linkedin.camus.coders.MessageDecoderException: com.linkedin.camus.coders.MessageDecoderException: java.lang.ClassNotFoundException: com.linkedin.camus.schemaregistry.AvroRestSchemaRegistry
     at com.linkedin.camus.etl.kafka.coders.MessageDecoderFactory.createMessageDecoder(MessageDecoderFactory.java:29)
     at com.linkedin.camus.etl.kafka.mapred.EtlInputFormat.createMessageDecoder(EtlInputFormat.java:391)
     at com.linkedin.camus.etl.kafka.mapred.EtlInputFormat.getSplits(EtlInputFormat.java:256)
     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1107)
     at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1124)
     at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:178)
     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:1023)
     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:976)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:976)
     at org.apache.hadoop.mapreduce.Job.submit(Job.java:582)
     at com.linkedin.camus.etl.kafka.CamusJob.run(CamusJob.java:335)
     at com.linkedin.camus.etl.kafka.CamusJob.run(CamusJob.java:563)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
     at com.linkedin.camus.etl.kafka.CamusJob.main(CamusJob.java:518)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.



Re: Trying to get kafka data to Hadoop

2015-03-04 Thread Jagat Singh
Also see the related tool:

http://confluent.io/downloads/

Confluent is bringing the glue together for Kafka, Avro, and Camus, though
there is no clarity around support (e.g. updates for new Kafka releases)
at this moment.



On Thu, Mar 5, 2015 at 8:57 AM, Joel Koshy jjkosh...@gmail.com wrote:

 [quoted message trimmed]




Re: Trying to get kafka data to Hadoop

2015-03-04 Thread Lakshmanan Muthuraman
I don't think the libjars are required. The Maven package command for the Camus
project builds an uber jar (fat jar) that contains all the dependencies.
I generally run Camus the following way:

hadoop jar camus-example-0.1.0-SNAPSHOT-shaded.jar \
  com.linkedin.camus.etl.kafka.CamusJob -P camus.properties

On Wed, Mar 4, 2015 at 2:16 PM, Jagat Singh jagatsi...@gmail.com wrote:

 [quoted message trimmed]



RE: Trying to get kafka data to Hadoop

2015-03-04 Thread Thunder Stumpges
What branch of Camus are you using? We have our own fork in which we updated the
Camus dependency from the Avro snapshot of the REST Schema Repository to the
new official one you mention at github.com/schema-repo. I was not aware of a
branch on the main LinkedIn Camus repo that has this.

That being said, we are doing essentially the same thing, although we are using
a single shaded uber jar. I believe the Maven project builds this automatically,
doesn't it?

I'll take a look at the details of how we are invoking this on our site and get 
back to you.

Cheers,
Thunder


-Original Message-
From: max square [max2subscr...@gmail.com]
Received: Wednesday, 04 Mar 2015, 5:38PM
To: users@kafka.apache.org [users@kafka.apache.org]
Subject: Trying to get kafka data to Hadoop

[quoted message trimmed]


Re: Trying to get kafka data to Hadoop

2015-03-04 Thread max square
Thunder,

Thanks for your reply. The Hadoop job is now correctly configured (the
client was not getting the correct jars); however, I am now getting Avro
formatting exceptions due to the format the schema-repo server follows. I
think I will do something similar and create our own branch that uses the
schema repo. Any gotchas you can advise on?

Thanks!

Max

On Wed, Mar 4, 2015 at 9:24 PM, Thunder Stumpges tstump...@ntent.com
wrote:

 [quoted message trimmed]



Re: Trying to get kafka data to Hadoop

2015-03-04 Thread Neha Narkhede
Thanks, Jagat, for the callout!

Confluent Platform 1.0 (http://confluent.io/product/) includes Camus, and we're
happy to address any questions on our community mailing list:
confluent-platf...@googlegroups.com.



On Wed, Mar 4, 2015 at 8:41 PM, max square max2subscr...@gmail.com wrote:

 [quoted message trimmed]