I'm also looking forward to this partitioning function. The issue title has been changed to STORM-1464.
2016-01-26 1:38 GMT+08:00 Aaron.Dossett <aaron.doss...@target.com>:

> Erik, it turned out that we did need this in production after all. I updated
> STORM-1494 to include partitioning and I will have an initial PR soon for
> review.
>
> From: Erik Weathers <eweath...@groupon.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Monday, January 11, 2016 at 6:00 PM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Cc: "d...@storm.apache.org" <d...@storm.apache.org>
> Subject: Re: HDFS Bolts -- partitioning output
>
> Awesome Aaron, I can send you what we have done offline!
>
> - Erik
>
> On Thu, Jan 7, 2016 at 11:12 AM, Aaron.Dossett <aaron.doss...@target.com>
> wrote:
>
>> Thanks, Erik. Your “Partitioner” is exactly what I had in mind and even
>> what I named my stubbed out interface :-) Since Target has decided against
>> this approach for other reasons, it will have to be a side project for me
>> for now.
>>
>> Best, Aaron
>>
>> From: Erik Weathers <eweath...@groupon.com>
>> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
>> Date: Wednesday, January 6, 2016 at 5:48 PM
>> To: "user@storm.apache.org" <user@storm.apache.org>
>> Cc: "d...@storm.apache.org" <d...@storm.apache.org>
>> Subject: Re: HDFS Bolts -- partitioning output
>>
>> hey Aaron,
>>
>> We've also written a similar bolt at Groupon, though we aren't super
>> satisfied with the implementation. :) We are begrudgingly using it because
>> there is no partitioning support in the OSS storm-hdfs bolt.
>>
>> One thing I do like about our implementation, though, is the ability
>> to define your own "Partitioner" in each topology to do various types of
>> partitioning (date-based, message ID-based, topic-based, whatever). It
>> would be great if your implementation had such logic too. E.g., when
>> deciding where to write a tuple's data, the Partitioner is called to
>> determine the HDFS path. For example, it can take the Tuple object and an
>> opaque key/value Configuration hash that can pass items like a kafka topic
>> name to be included in the HDFS path.
>>
>> - Erik
>>
>> On Tue, Dec 29, 2015 at 7:12 AM, Aaron.Dossett <aaron.doss...@target.com>
>> wrote:
>>
>>> Hi,
>>>
>>> My team was exploring changes to the HDFS bolts that would allow for
>>> partitioning the output, for example into directories corresponding to
>>> day. This is different from the existing functionality to rotate files
>>> based on a set length of time. For unrelated reasons, we are probably not
>>> going to pursue this further. However, I have some code changes that
>>> implement most of this functionality for at least some partitioning use
>>> cases. If there is interest from the user or developer community for this
>>> feature, I could get it in shape for a PR to get feedback about our
>>> implementation approach.
>>>
>>> Any feedback on this idea is welcome. Thanks! -Aaron
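
For anyone following along, here is a rough sketch of the "Partitioner" idea Erik describes in the quoted thread: something that takes a tuple plus an opaque key/value config and returns the HDFS directory for that tuple's data. This is not the storm-hdfs API and not what either Groupon or Target implemented. The interface name, method name, the "timestamp" field, and the "kafka.topic" key are all hypothetical, and a plain Map stands in for Storm's Tuple so the example runs without Storm on the classpath.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

public class PartitionerSketch {

    /**
     * Hypothetical interface (not part of storm-hdfs): maps one record plus
     * topology-level config to the HDFS directory its data should go in.
     * A Map stands in for Storm's Tuple here (assumption for illustration).
     */
    public interface Partitioner {
        String getPartitionPath(Map<String, Object> tupleFields, Map<String, String> conf);
    }

    /**
     * Date-based example producing /<topic>/<yyyy>/<MM>/<dd>.
     * Assumes the record carries an epoch-millis "timestamp" field and the
     * config may carry a "kafka.topic" entry; both names are made up.
     */
    public static class DatePartitioner implements Partitioner {
        private static final DateTimeFormatter DAY =
                DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);

        @Override
        public String getPartitionPath(Map<String, Object> tupleFields, Map<String, String> conf) {
            long millis = ((Number) tupleFields.get("timestamp")).longValue();
            String topic = conf.getOrDefault("kafka.topic", "unknown");
            return "/" + topic + "/" + DAY.format(Instant.ofEpochMilli(millis));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> tupleFields = new HashMap<>();
        tupleFields.put("timestamp", 1452124800000L); // 2016-01-07T00:00:00Z

        Map<String, String> conf = new HashMap<>();
        conf.put("kafka.topic", "clicks");

        Partitioner partitioner = new DatePartitioner();
        System.out.println(partitioner.getPartitionPath(tupleFields, conf)); // /clicks/2016/01/07
    }
}
```

A bolt built along these lines would call the partitioner once per tuple and keep one open file per returned directory, which is what distinguishes this from the existing time-based file rotation.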