Re: Running Calcite integration tests in docker

2018-05-15 Thread Michael Mior
I hadn't seen Testcontainers before. That looks pretty cool and I think it
would be a great idea to explore. It seems like having a stable test
infrastructure in Docker like Francis has been working on is a good first
step.

--
Michael Mior
mm...@uwaterloo.ca


Re: Running Calcite integration tests in docker

2018-05-15 Thread Kevin Risden
One idea I've been throwing around in my head is
https://www.testcontainers.org/, which would let you run Docker
containers within unit tests. It's only an idea at this point, but it could
move the actual integration tests back into Calcite instead of keeping them
completely separate.

Kevin Risden
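
A rough sketch of the Testcontainers idea above, assuming the org.testcontainers dependency and a JDBC driver are on the classpath (the "postgres:10" image tag is illustrative, and a local Docker daemon is required):

```java
import org.testcontainers.containers.PostgreSQLContainer;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PostgresContainerSketch {
    public static void main(String[] args) throws Exception {
        // Start a throwaway Postgres container; try-with-resources stops and
        // removes it when the block exits, so the test leaves nothing behind.
        try (PostgreSQLContainer<?> pg = new PostgreSQLContainer<>("postgres:10")) {
            pg.start();
            // Testcontainers hands back the JDBC coordinates for the container,
            // so a test can connect without any hard-coded host/port.
            try (Connection conn = DriverManager.getConnection(
                    pg.getJdbcUrl(), pg.getUsername(), pg.getPassword());
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT 1")) {
                rs.next();
                System.out.println(rs.getInt(1));
            }
        }
    }
}
```

In a real test suite this would live in a JUnit rule or `@BeforeAll` method rather than `main`, but the lifecycle is the same: the database exists only for the duration of the test.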


Re: Running Calcite integration tests in docker

2018-05-03 Thread Francis Chuang

Hey Christian,

Thanks for clarifying! I think it's possible to ask infrastructure to set
up a Docker repository for us; however, it also means that it's
something we need to maintain versus using the official Geode images. I am
not sure what the policy is around releasing an artifact and Docker
image for testing that is not part of an "official release", though.


Ideally, it would be better if GEODE-3971 were fixed to avoid this
problem. How long does the Spring app take to download dependencies and
compile? If it's not too long, perhaps we can use the Maven Docker image
and compile the Spring app when we run the integration tests.


Francis
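
The compile-inside-the-Maven-image idea above might look roughly like this (the image tag, working directory, and jar name are all illustrative assumptions, not the project's actual layout):

```shell
# Build the Spring Boot ingestion app inside the official Maven image,
# so contributors need only Docker, not a local Maven/JDK install.
# Mounting ~/.m2 caches dependencies across runs, which addresses the
# "how long does it take to download dependencies" concern.
docker run --rm \
  -v "$PWD":/usr/src/app \
  -v "$HOME/.m2":/root/.m2 \
  -w /usr/src/app \
  maven:3.5-jdk-8 \
  mvn -q package

# Then run the resulting jar (the artifact name here is a placeholder):
java -jar target/geode-standalone-cluster.jar
```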

On 3/05/2018 3:37 PM, Christian Tzolov wrote:

Hi Francis,

Regarding Geode, I initially tried to ingest the test data via
Geode's REST/JSON endpoint, but bumped into this bug:
https://issues.apache.org/jira/browse/GEODE-3971 (still unresolved).

As a consequence, I had to write a custom ingestion tool using the Geode Java
API. But since I had to compile and run this tool anyway, it makes little
sense to deploy and maintain the Geode cluster via scripts (or a Docker
image). Instead, it is simpler to embed the entire Geode cluster into the
same (standalone) application. Because this is a Spring Boot app, if we
deploy the pre-built jar to a public Maven repo we can easily create a
Docker image that runs it in one line (e.g. java -jar ./)

Any ideas where we can host this project and where to release it?

Cheers,
Christian
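
The one-line-run idea Christian describes could be packaged as a minimal Dockerfile along these lines (the base image, jar path, and exposed ports are assumptions; 10334/40404 are Geode's default locator/server ports):

```dockerfile
# Minimal image that just runs the pre-built Spring Boot jar, which embeds
# the standalone Geode cluster and loads the test data on startup.
FROM openjdk:8-jre-alpine
COPY target/geode-standalone-cluster.jar /app.jar
EXPOSE 10334 40404
ENTRYPOINT ["java", "-jar", "/app.jar"]
```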


On 23 April 2018 at 14:12, Francis Chuang  wrote:


Thanks, Michael!

I noticed that I forgot the link to my fork in my original message. Here
is my fork if someone wants to hack on it a bit more:
https://github.com/Boostport/calcite-test-dataset/tree/switch-to-docker


On 23/04/2018 9:58 PM, Michael Mior wrote:


Thanks for raising this Francis. I was hoping to find more time to spend
on
this but unfortunately that hasn't happened.

1. That's a question for Christian Tzolov. I'm not too familiar with
Geode.
2. You are correct that the VM contains several different database servers
with various ports exposed. I'm not sure what the situation is with
HSQLDB/H2.
3. Maven is definitely not strictly necessary although some of the
dependencies currently pull in datasets that are used for some of the DBs
before building the VM.
4.  I don't really have a strong preference either way. I'm sure someone
else can speak to why this was separated in the first place.


--
Michael Mior
mm...@uwaterloo.ca

2018-04-23 7:11 GMT-04:00 Francis Chuang :

There is currently an issue open for this in the calcite-test-dataset

repository[1], however, I would like to hear more from the wider
community
regarding this.

I have created a `switch-to-docker` branch on my fork and committed a
docker-compose.yml under the docker folder, but ran into a few roadblocks
and didn't have any more time to investigate.

I am currently investigating using docker-composer to orchestrate and set
up the containers.

Questions:

1. I am not very familiar with Apache Geode. I was able to start the
server and locator using the official docker image, but there does not
appear to be anyway to import data. In the current repository, there's
some
java code in `geode-standalone-cluster`. Why do/did we need to write Java
code to stand up a geode cluster? Does anyone know if there are any
standalone tools (preferably something with built binaries) that we can
use
to directly ingest the JSON data?

2. From my reading of the integration test instructions[2], the
calcite-test-dataset spins up a VM with databases preloaded with data
which
the main calcite repository runs tests against. HSQLDB and H2 does not
have
any open ports in the VM that's spun up. How does does Calcite run tests
against HSQLDB and H2?

3. What is the role of maven in the calcite-test-dataset repository? I
see
a lot of POMs in various subfolders such as mysql, postgresql, etc.
However, I am not sure what these do. If maven is used to spin up the VM,
perhaps we could remove the dependency on it and just run a
`docker-compose
up` to start the network of containers.

4. Is there any interest in bringing the contents of calcite-test-dataset
directly into the Calcite repo? The repo zips up to 1.5MB, so it might
not
bring to much bloat to the Calcite repo.

Francis

[1] https://github.com/vlsi/calcite-test-dataset/issues/8

[2] https://calcite.apache.org/docs/howto.html#running-integration-tests









Re: Running Calcite integration tests in docker

2018-04-23 Thread Francis Chuang

Thanks, Michael!

I noticed that I forgot the link to my fork in my original message. Here 
is my fork if someone wants to hack on it a bit more: 
https://github.com/Boostport/calcite-test-dataset/tree/switch-to-docker



Re: Running Calcite integration tests in docker

2018-04-23 Thread Michael Mior
Thanks for raising this, Francis. I was hoping to find more time to spend on
this, but unfortunately that hasn't happened.

1. That's a question for Christian Tzolov. I'm not too familiar with Geode.
2. You are correct that the VM contains several different database servers
with various ports exposed. I'm not sure what the situation is with
HSQLDB/H2.
3. Maven is definitely not strictly necessary, although some of the
dependencies currently pull in datasets that are used for some of the DBs
before building the VM.
4. I don't really have a strong preference either way. I'm sure someone
else can speak to why this was separated in the first place.


--
Michael Mior
mm...@uwaterloo.ca

2018-04-23 7:11 GMT-04:00 Francis Chuang :

> There is currently an issue open for this in the calcite-test-dataset
> repository[1], however, I would like to hear more from the wider community
> regarding this.
>
> I have created a `switch-to-docker` branch on my fork and committed a
> docker-compose.yml under the docker folder, but ran into a few roadblocks
> and didn't have any more time to investigate.
>
> I am currently investigating using docker-compose to orchestrate and set
> up the containers.
>
> Questions:
>
> 1. I am not very familiar with Apache Geode. I was able to start the
> server and locator using the official Docker image, but there does not
> appear to be any way to import data. In the current repository, there's some
> Java code in `geode-standalone-cluster`. Why do/did we need to write Java
> code to stand up a Geode cluster? Does anyone know if there are any
> standalone tools (preferably something with built binaries) that we can use
> to directly ingest the JSON data?
>
> 2. From my reading of the integration test instructions[2], the
> calcite-test-dataset spins up a VM with databases preloaded with data which
> the main Calcite repository runs tests against. HSQLDB and H2 do not have
> any open ports in the VM that's spun up. How does Calcite run tests
> against HSQLDB and H2?
>
> 3. What is the role of Maven in the calcite-test-dataset repository? I see
> a lot of POMs in various subfolders such as mysql, postgresql, etc.
> However, I am not sure what these do. If Maven is used to spin up the VM,
> perhaps we could remove the dependency on it and just run a `docker-compose
> up` to start the network of containers.
>
> 4. Is there any interest in bringing the contents of calcite-test-dataset
> directly into the Calcite repo? The repo zips up to 1.5MB, so it might not
> add too much bloat to the Calcite repo.
>
> Francis
>
> [1] https://github.com/vlsi/calcite-test-dataset/issues/8
>
> [2] https://calcite.apache.org/docs/howto.html#running-integration-tests
>
>
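
A first cut of the `docker-compose up` approach Francis describes in the thread might look like the sketch below. The image tags, credentials, and init-script paths are placeholders; the `/docker-entrypoint-initdb.d` convention is how the official MySQL and Postgres images load seed data on first start:

```yaml
version: "3"
services:
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: calcite   # placeholder credentials for local testing only
      MYSQL_DATABASE: test
    ports:
      - "3306:3306"
    volumes:
      # .sql files in this host folder are executed on first container start.
      - ./mysql/initdb:/docker-entrypoint-initdb.d
  postgres:
    image: postgres:10
    environment:
      POSTGRES_PASSWORD: calcite
    ports:
      - "5432:5432"
    volumes:
      - ./postgresql/initdb:/docker-entrypoint-initdb.d
```

Each remaining database in the dataset would get its own service entry in the same file, which would replace the per-database Maven modules with plain init scripts.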