Hi RJ. 

Maven repositories and Docker containers for the transaction queue are good 
enough IMO.  That will give people a way to compose them in different idioms 
(one for Java folks, another for a broader Linux audience).

I think the lib designs are fairly intuitive.  I would say that we should 
constrain them all to being written in Java or Groovy to keep the bigtop theme 
of "JVM for everything" :).

Any particular questions you have around technical design can be followed up in a 
JIRA or else maybe a Readme spec that goes in the top level of the 
data-generators dir...

> On Aug 30, 2015, at 1:51 AM, RJ Nowling <[email protected]> wrote:
> 
> I'd like to keep this conversation going.
> 
> So here are a few discussion points:
> 
> 1. How do we want to make the data generators available?  Maven?  RPMs and
> Debs?
> 
> For now, I'm using a Gradle multi-project build to easily build and install
> the BPS data generators and their libraries into a local Maven repo.  This
> makes development easy.  Eventually, I would like to post binaries through
> Maven for easy integration by users.  RPMs / Debs could be interesting
> since I use a pattern where the data generators are libraries (to support
> application integration / parallelization by the host framework) but also
> provide CLI drivers for local testing.
> 
> 2.  The idea of using the data generators as part of the smoke tests came
> up.  Since there is concern about making the data generators required, we
> could offer the blueprints (BigPetStore) as optional smoke tests.  Would
> that be a good compromise?
> 
> 3.  How will they be maintained?
> 
> I'll certainly add myself to the maintainers list and will be taking
> responsibility.  I'm happy to have others help as well if anyone wants to
> -- if not, that's cool, too.
> 
> 4. Is anyone interested at all in discussing library APIs and designs?
> What about internal interfaces and such?
> 
> 
> My plan was to add at least one more data generator (weather simulator) to
> bigtop-data-generators in the short term.  However, given the concerns
> raised by Cos (more discussion needed) and Olaf (don't want to force data
> generators on unsuspecting users ;) ), I would like to reach some consensus
> on what people are concerned about and solutions.
> 
> On Thu, Aug 27, 2015 at 12:38 PM, Konstantin Boudnik <[email protected]> wrote:
> 
>> Fine by me. I have linked this thread to the JIRA ticket that RJ created, so
>> we have a way to connect one to another ;)
>> 
>>> On Thu, Aug 27, 2015 at 01:02PM, Olaf Flebbe wrote:
>>> Hi,
>>> 
>>> I am not confident that moving important design discussions with impact on
>>> the whole project to JIRA is a good idea.
>>> 
>>> In the current JIRA traffic storm it is not easy to identify and follow
>>> important tickets.
>>> 
>>> Please keep discussions on the list or at least, please state on this
>>> list which ticket to follow ...
>>> 
>>> Olaf
>>> 
>>> 
>>> 
>>>> On 26.08.2015 at 22:56, Konstantin Boudnik <[email protected]> wrote:
>>>> 
>>>> On Wed, Aug 26, 2015 at 10:38PM, Olaf Flebbe wrote:
>>>>> Hi,
>>>>> 
>>>>> Nice to have data generators in Bigtop.
>>>>> 
>>>>> But please do not include it in bigtop_utils, since this package is
>>>>> mandatory. Not everyone needs a data generator.
>>>> 
>>>> Yup. And let's move further design discussion to the JIRA!
>>>> 
>>>>> Olaf
>>>>> 
>>>>> 
>>>>>> On 26.08.2015 at 11:25, Jay Vyas <[email protected]> wrote:
>>>>>> 
>>>>>> Publishing the jar to Bigtop's Maven is probably a good first step;
>>>>>> then apps can just include it as needed...?
>>>>>> 
>>>>>> I'm not against packaging if someone wants packages for this.  Maybe
>>>>>> even include it in bigtop util?
>>>>>> 
>>>>>> Let's move to JIRA.
>>>>>> 
>>>>>>> On Aug 25, 2015, at 9:41 PM, Konstantin Boudnik <[email protected]> wrote:
>>>>>>> 
>>>>>>> It is pretty cool indeed!
>>>>>>> 
>>>>>>> I wonder how it needs to be structured to be:
>>>>>>> - easy to access/use from other components wherever it is needed
>>>>>>> - doesn't interfere with the rest of the stack
>>>>>>> 
>>>>>>> I guess one possible way would be to implement the generator as a set of
>>>>>>> Maven artifacts that could be installed/consumed transparently by just
>>>>>>> declaring a dependency, e.g. as proposed via a top-level component.
>>>>>>> 
>>>>>>> Another way is to have a new package like we do for bigtop-utils and such.
>>>>>>> 
>>>>>>> Perhaps this discussion should be moved to JIRA, or shall we continue on
>>>>>>> the dev@ ??
>>>>>>> 
>>>>>>> Cos
>>>>>>> 
>>>>>>>> On Sun, Aug 23, 2015 at 11:53AM, RJ Nowling wrote:
>>>>>>>> Hi BigTop,
>>>>>>>> 
>>>>>>>> I had a discussion with Jay yesterday, and we'd like to propose a new
>>>>>>>> component for BigTop: BigTop Data Generators.
>>>>>>>> 
>>>>>>>> BigTop Data Generators would consist of a common set of libraries for
>>>>>>>> building data generators and three example data generators:
>>>>>>>> 
>>>>>>>> * BigPetStore transaction generator (moved from BigPetStore)
>>>>>>>> * BigTop Bazaar -- attendee movement and interactions with booths on a
>>>>>>>> showroom floor, at a conference, or at a mall
>>>>>>>> * BigTop Weatherman -- stochastic weather simulation (temperature, wind
>>>>>>>> speed, wind chill, rainfall, etc.) per zip code.  (From a model trained on
>>>>>>>> NOAA historical weather data)
>>>>>>>> 
>>>>>>>> We believe that creating a common set of libraries will have several
>>>>>>>> benefits including:
>>>>>>>> 
>>>>>>>>  * Easier for others to build their own data generators
>>>>>>>>  * Make data generators smaller and easier to maintain
>>>>>>>>  * Share improvements across the data generators
>>>>>>>> 
>>>>>>>> More details on the libraries are below.
>>>>>>>> 
>>>>>>>> BigPetStore will continue to focus on building and maintaining
>>>>>>>> blueprints, powered by the BigTop Data Generators.
>>>>>>>> 
>>>>>>>> Our vision is that we get all of Apache coming to BigTop for tools for
>>>>>>>> building better, more comprehensive blueprints.  We want to support these
>>>>>>>> efforts through data generators and the initial set of blueprints we've
>>>>>>>> been building.
>>>>>>>> 
>>>>>>>> If the community is generally in support of this, I can create a top-level
>>>>>>>> "bigtop-data-generators" directory and put the data generators and
>>>>>>>> libraries in there.
>>>>>>>> 
>>>>>>>> Thanks!
>>>>>>>> 
>>>>>>>> RJ
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -------
>>>>>>>> Library details:
>>>>>>>> 
>>>>>>>> So far, I've extracted the following common libraries:
>>>>>>>> 
>>>>>>>>  * Samplers -- provides classes for PDFs and various samplers
>>>>>>>>  * Name generator -- data set and samplers for generating names
>>>>>>>>  * Location data set -- data set and classes for US zip codes, their
>>>>>>>> GPS coordinates, median household incomes, and population sizes
>>>>>>>>  * Product generator -- library for enumerating products from a
>>>>>>>> specification file.  Comes with default specifications for BigPetStore
>>>>>>>> 
>>>>>>>> I also expect that I'll add libraries for:
>>>>>>>> 
>>>>>>>>   * Particle simulation -- customer movement in a room
>>>>>>>>   * Latent factor model generation -- generate latent factors and
>>>>>>>> customer weights to create something like MovieLens data.  Used in Bazaar
>>>>>>>> for booth preferences and potentially in BigPetStore for customer item
>>>>>>>> preferences
>>>>>>>> 
>>>>>>>> Most of these libraries came out of the BigPetStore data generator but the
>>>>>>>> other generators have been refactored to be based off the standard set of
>>>>>>>> libraries.
>> 
>> 
>> 
