Re: Storm on JDK 8

2014-07-18 Thread Haralds Ulmanis
It runs on JDK 8; at least it works for me.


On 18 July 2014 07:43, Anand Nalya anand.na...@gmail.com wrote:

 Hi,

 Is Storm 0.9.2 compatible with JDK 8, and can it be run in production, or
 should I stick with JDK 7?

 Regards,
 Anand



Messages in fly

2014-07-11 Thread Haralds Ulmanis
Does anyone know how to look up the current number of messages in flight?
I'm pushing messages to a spout and I'd like some logic to tell when the
cluster is too busy.
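
One way to approximate this from inside the spout is to count tuples that have been emitted but not yet acked or failed. The sketch below is only an illustration under assumptions: `InFlightTracker` and its threshold are hypothetical names, not part of Storm, and it presumes the spout emits with message IDs so that `ack` and `fail` are actually called. Storm's own `topology.max.spout.pending` setting caps pending tuples per spout task and may be enough on its own.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Storm: counts tuples emitted but not yet acked or failed.
final class InFlightTracker {
    private final AtomicLong inFlight = new AtomicLong();
    private final long busyThreshold;

    InFlightTracker(long busyThreshold) {
        this.busyThreshold = busyThreshold;
    }

    // Call when the spout emits a tuple with a message ID.
    void onEmit() {
        inFlight.incrementAndGet();
    }

    // Call from both ack(msgId) and fail(msgId): either way the tuple is no longer pending.
    void onAckOrFail() {
        inFlight.decrementAndGet();
    }

    // Current number of pending tuples seen by this spout task.
    long current() {
        return inFlight.get();
    }

    // True once the pending count reaches the configured threshold.
    boolean isBusy() {
        return inFlight.get() >= busyThreshold;
    }
}
```

The count is per spout task, so it describes only what that task has emitted, not the whole cluster.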


Re: Help is processing huge data through Kafka-storm cluster

2014-06-14 Thread Haralds Ulmanis
And what about CPU/network/disk utilization? And the load factors per bolt
from the Storm UI?


On 14 June 2014 15:53, Shaikh Riyaz shaikh@gmail.com wrote:

 Hi,

 We download 28 million messages daily, which adds up to 800+ million per
 month.

 We want to process this amount of data through our Kafka and Storm cluster
 and store it in an HBase cluster.

 Our target is to process one month of data in one day. Is that possible?

 We set up our cluster expecting to process a million messages per second,
 as mentioned on the web. Unfortunately, we have ended up processing only
 1200-1700 messages per second. If we continue at this speed, it will take
 at least 10 days to process 30 days of data, which is not an acceptable
 solution in our case.
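
For reference, the target above can be turned into a required sustained rate. This is a small arithmetic sketch only: `ThroughputEstimate` is a hypothetical helper, and it assumes the monthly volume is exactly 800 million messages, whereas the message says "800+".

```java
// Hypothetical helper for back-of-the-envelope throughput arithmetic.
final class ThroughputEstimate {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    // Messages per second needed to process `messages` within `days` days.
    static double requiredRate(long messages, double days) {
        return messages / (days * SECONDS_PER_DAY);
    }

    // Days needed to process `messages` at a sustained rate of `messagesPerSecond`.
    static double daysAtRate(long messages, double messagesPerSecond) {
        return messages / messagesPerSecond / SECONDS_PER_DAY;
    }
}
```

With 800,000,000 messages, `requiredRate(800_000_000L, 1.0)` comes to roughly 9,260 messages per second, which is about five to eight times the 1200-1700 messages per second reported here.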

 I suspect that we have to change some configuration to achieve this goal.
 Looking for help from experts to support me in achieving this task.

 *Kafka Cluster:*
 Kafka is running on two dedicated machines with 48 GB of RAM and 2 TB of
 storage. We have an 11-node Kafka cluster in total, spread across these
 two servers.

 *Kafka Configuration:*
 producer.type=async
 compression.codec=none
 request.required.acks=-1
 serializer.class=kafka.serializer.StringEncoder
 queue.buffering.max.ms=10
 batch.num.messages=1
 queue.buffering.max.messages=10
 default.replication.factor=3
 controlled.shutdown.enable=true
 auto.leader.rebalance.enable=true
 num.network.threads=2
 num.io.threads=8
 num.partitions=4
 log.retention.hours=12
 log.segment.bytes=536870912
 log.retention.check.interval.ms=6
 log.cleaner.enable=false

 *Storm Cluster:*
 Storm is running with 5 supervisors and 1 Nimbus on IBM servers with 48 GB
 of RAM and 8 TB of storage. These servers are shared with the HBase cluster.

 *Kafka spout configuration*
 kafkaConfig.bufferSizeBytes = 1024*1024*8;
 kafkaConfig.fetchSizeBytes = 1024*1024*4;
 kafkaConfig.forceFromStart = true;

 *Topology: StormTopology*
 Spout   - Partition: 4
 First Bolt -  parallelism hint: 6 and Num tasks: 5
 Second Bolt -  parallelism hint: 5
 Third Bolt -   parallelism hint: 3
 Fourth Bolt   -  parallelism hint: 3 and Num tasks: 4
 Fifth Bolt -  parallelism hint: 3
 Sixth Bolt -  parallelism hint: 3

 *Supervisor configuration:*

 storm.local.dir: /app/storm
 storm.zookeeper.port: 2181
 storm.cluster.mode: distributed
 storm.local.mode.zmq: false
 supervisor.slots.ports:
 - 6700
 - 6701
 - 6702
 - 6703
 supervisor.worker.start.timeout.secs: 180
 supervisor.worker.timeout.secs: 30
 supervisor.monitor.frequency.secs: 3
 supervisor.heartbeat.frequency.secs: 5
 supervisor.enable: true

 storm.messaging.netty.server_worker_threads: 2
 storm.messaging.netty.client_worker_threads: 2
 storm.messaging.netty.buffer_size: 52428800 #50MB buffer
 storm.messaging.netty.max_retries: 25
 storm.messaging.netty.max_wait_ms: 1000
 storm.messaging.netty.min_wait_ms: 100


 supervisor.childopts: -Xmx1024m -Djava.net.preferIPv4Stack=true
 worker.childopts: -Xmx2048m -Djava.net.preferIPv4Stack=true


 Please let me know if more information is needed.

 Thanks in advance.

 --
 Regards,

 Riyaz




Re: how to deploy storm-0.9.2-incubating to my own maven repos?

2014-06-11 Thread Haralds Ulmanis
Hi, if I remember right, you need to disable this plugin (comment out or
delete these lines):
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-release-plugin</artifactId>
  <configuration>
    <autoVersionSubmodules>true</autoVersionSubmodules>
    <tagNameFormat>v@{project.version}</tagNameFormat>
  </configuration>
</plugin>



On 11 June 2014 03:07, 鞠大升 dashen...@gmail.com wrote:

 In DEVELOPER.md, I found how to build the code and create a Storm
 distribution:

 mvn clean install -DskipTests=true

 cd storm-dist/binary && mvn package

 But how do I deploy storm-0.9.2-incubating to my own Maven repository?

 I have changed the distributionManagement section of pom.xml to point to
 my own Maven repository, but when I run mvn deploy, it fails with:

 --

 Downloading:
 https://repository.apache.org/content/repositories/snapshots/org/apache/storm/storm/0.9.2-incubating-SNAPSHOT/maven-metadata.xml

 Downloaded:
 https://repository.apache.org/content/repositories/snapshots/org/apache/storm/storm/0.9.2-incubating-SNAPSHOT/maven-metadata.xml
 (2 KB at 0.2 KB/sec)

 Uploading:
 https://repository.apache.org/content/repositories/snapshots/org/apache/storm/storm/0.9.2-incubating-SNAPSHOT/storm-0.9.2-incubating-20140611.020306-2.pom
 --



 --
 dashengju
 +86 13810875910
 dashen...@gmail.com



storm-kafka external project

2014-06-02 Thread Haralds Ulmanis
First, there is a small typo-like error in:
https://github.com/apache/incubator-storm/blob/master/external/storm-kafka/src/jvm/storm/kafka/PartitionManager.java
line 217:
 if (lastCompletedOffset != lastCompletedOffset) {
I guess it should be something like:
 if (_committedTo != lastCompletedOffset) {
Without that, it will never save the position information.
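
The effect of the self-comparison can be shown in isolation. This is a minimal sketch, not the actual PartitionManager code: `OffsetCommitCheck` and its method names are hypothetical, and only the two conditions are taken from the lines quoted above.

```java
// Hypothetical stand-alone illustration of the two conditions discussed above.
final class OffsetCommitCheck {
    // As quoted: the value is compared with itself, so the result is false for every input.
    static boolean shouldCommitAsQuoted(long committedTo, long lastCompletedOffset) {
        return lastCompletedOffset != lastCompletedOffset;
    }

    // As suggested: true whenever the completed offset differs from the last committed one.
    static boolean shouldCommitAsSuggested(long committedTo, long lastCompletedOffset) {
        return committedTo != lastCompletedOffset;
    }
}
```

For a committed offset of 100 and a completed offset of 150, the quoted condition is false and the suggested one is true, so only the suggested version would reach the code that saves the position.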

The next thing is the Kafka bolt.
Taking the topic name from the global configuration is probably not the
best way: I may want to attach two bolts, each sending to a different
queue. Something like SpoutConfig for the spout would probably work better
for the bolt as well (so you can pass the queue name, or maybe even a
different zk broker list if wanted).
For example:
collector.emit(stream1,new Values(v1));
collector.emit(stream2,new Values(v2));
and I'll bind each stream to a different Kafka bolt so that each message is
sent to the right queue.
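
The per-bolt configuration idea could look roughly like the following. This is a sketch under assumptions, not an existing storm-kafka API: `KafkaBoltConfig` and `StreamRouting` are hypothetical names, and the broker and topic values used with them would be placeholders.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-bolt settings, loosely modelled on the idea of SpoutConfig; not a storm-kafka class.
final class KafkaBoltConfig {
    final String brokerList;
    final String topic;

    KafkaBoltConfig(String brokerList, String topic) {
        this.brokerList = brokerList;
        this.topic = topic;
    }
}

// Maps each upstream stream ID to the configuration of the Kafka bolt bound to it.
final class StreamRouting {
    private final Map<String, KafkaBoltConfig> byStream = new HashMap<String, KafkaBoltConfig>();

    // Registers the bolt configuration for one stream and returns this for chaining.
    StreamRouting bind(String streamId, KafkaBoltConfig config) {
        byStream.put(streamId, config);
        return this;
    }

    // Returns the topic for a stream, or fails loudly if no bolt was bound to it.
    String topicFor(String streamId) {
        KafkaBoltConfig config = byStream.get(streamId);
        if (config == null) {
            throw new IllegalArgumentException("No Kafka bolt bound to stream " + streamId);
        }
        return config.topic;
    }
}
```

Each bolt instance would then receive its own configuration object at construction time instead of reading one topic name from the global configuration.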

Haralds