[GitHub] incubator-quickstep issue #360: Fix the inclusion guard of ForemanSingleNode...

2018-06-21 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/360
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #359: Fixed the build issues regarding tmb benchma...

2018-06-19 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/359
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #358: Fix a bug in HashJoinOperator

2018-06-04 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/358
  
Tests added. The problematic code would fail on the added tests.


---


[GitHub] incubator-quickstep issue #355: QUICKSTEP-127 Data provider thread

2018-06-04 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/355
  
@hbdeshmukh Hi Harshad, can you rebase the branch so that I can help merge 
this PR?


---


[GitHub] incubator-quickstep issue #357: Fixed the command execution bug in the distr...

2018-06-04 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/357
  
LGTM!


---


[GitHub] incubator-quickstep issue #355: QUICKSTEP-127 Data provider thread

2018-06-04 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/355
  
LGTM!


---


[GitHub] incubator-quickstep pull request #358: Fix a bug in HashJoinOperator

2018-06-03 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/358

Fix a bug in HashJoinOperator

This PR fixes a bug in `HashJoinOperator` w.r.t. the swapping of 
probe/build sides in a previous PR. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jianqiao/incubator-quickstep fix-filter-side

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/358.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #358


commit c4a072f301593b3f92a009ba752c05b2226a0f32
Author: Jianqiao Zhu 
Date:   2018-06-03T20:38:36Z

Fix a bug of filter side in HashJoinOperator




---


[GitHub] incubator-quickstep issue #347: QUICKSTEP-121: Added the self-join support.

2018-05-09 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/347
  
LGTM! Merging.

Note that the `concretize` signature in `Physical` plans looks somewhat 
cumbersome; we may add a `SubstitutionContext` class to wrap these arguments 
in a future PR.
```
::quickstep::Predicate* concretize(
const std::unordered_map<ExprId, const CatalogAttribute*> 
_map,
const std::unordered_set _expr_ids = 
std::unordered_set(),
const std::unordered_set _expr_ids = 
std::unordered_set()) const override;
```


---


[GitHub] incubator-quickstep issue #353: Minor bug fixes and refactors.

2018-05-09 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/353
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #354: Fixed the union-all elimiation case where so...

2018-05-09 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/354
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #351: Use Exactness info in Catalog stats.

2018-05-07 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/351
  
The stats can be used to provide an estimation even when they are not 
exact. 


---


[GitHub] incubator-quickstep issue #350: Fixed the bug regarding EliminateEmptyNode a...

2018-05-04 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/350
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #349: Fixed the bug regarding EliminateEmptyNode o...

2018-05-03 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/349
  
LGTM! Merging.


---


[GitHub] incubator-quickstep pull request #346: Add a python script to auto fix CMake...

2018-04-27 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/346

Add a python script to auto fix CMakeLists files

This PR adds a script that intends to help improve developer productivity 
by automatically fixing `CMakeLists.txt` files for the Quickstep project (with 
best effort).

The script will do the following things:
- Scan the repo's subdirectories and collect `#include` information from 
all source code files.
- Parse existing `CMakeLists.txt` files and convert all "recognized" 
commands into proper intermediate representations -- the "unrecognized" parts 
will be kept as verbatim lines.
- Resolve subdirectories, targets and link dependencies. Add / delete / 
update the corresponding entries.
- Convert the intermediate representations back to `CMakeLists.txt` files.

**NOTE:** Currently the script is at its initial stage and will not update 
tests or conditional targets (i.e. those within cmake `if` commands). It is 
likely to work well if you just create/delete some files or add/remove some 
`#include`'s -- otherwise additional manual fixes may need to be done after 
applying the script.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jianqiao/incubator-quickstep 
autofix-cmake-tool

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/346.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #346


commit 73d796dee760a03a91d55cb0fe4d8f073f831237
Author: Jianqiao Zhu <jianqiao@...>
Date:   2018-04-27T22:28:51Z

Add a python script to auto fix CMakeLists files




---


[GitHub] incubator-quickstep issue #344: QUICKSTEP-123: Fixed the missing 'has_repart...

2018-04-27 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/344
  
LGTM! Merging.


---


[GitHub] incubator-quickstep pull request #343: Fix all CMakeLists.txt for automated ...

2018-04-26 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/343

Fix all CMakeLists.txt for automated processing

This PR fixes and adjusts the style of all `CMakeLists.txt` so that they 
become stable (i.e. well-formatted) to be processed by an automated tool.

The above-mentioned tool will be proposed in a subsequent PR. It is 
intended to help improve developer productivity, as it scans source code 
dependencies and automatically fixes `CMakeLists.txt` files.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jianqiao/incubator-quickstep autofix-cmake

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/343.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #343


commit 3169b5646aecae35485a76f4439a121d9f66b3e2
Author: Jianqiao Zhu <jianqiao@...>
Date:   2018-04-18T05:54:31Z

Fix and rearrange all CMakeLists.txt so that they are ready to be processed 
and regenerated by an automation tool.




---


[GitHub] incubator-quickstep issue #342: Quickstep-119: Added the rule that eliminate...

2018-04-26 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/342
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #340: More informative error for BlockNotFound exc...

2018-04-18 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/340
  
LGTM! Merging.


---


[GitHub] incubator-quickstep pull request #340: More informative error for BlockNotFo...

2018-04-17 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/340#discussion_r182212888
  
--- Diff: storage/StorageErrors.hpp ---
@@ -61,9 +61,16 @@ class BlockMemoryTooSmall : public std::exception {
  **/
 class BlockNotFoundInMemory : public std::exception {
  public:
+  BlockNotFoundInMemory(int block_id) : block_id_(block_id) {}
+
   virtual const char* what() const throw() {
-    return "BlockNotFoundInMemory: The specified block was not found in memory";
+    std::string message = "BlockNotFoundInMemory: The specified block with ID "
+      + std::to_string(block_id_ )+ " was not found in memory";
+    return message.c_str();
   }
+
+ private:
+  int block_id_;
--- End diff --

Suggested fix:
```
  const std::string block_id_message_;
```
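
The reason for keeping the message as a member: in the diff above, `what()` returns `message.c_str()` taken from a local `std::string`, so the pointer dangles as soon as the function returns. Below is a minimal sketch of the suggested approach, assuming the message is built once in the constructor. The class name `BlockNotFoundInMemorySketch` is made up for illustration and is not the code that was merged.

```cpp
#include <exception>
#include <string>

// Hypothetical stand-in for BlockNotFoundInMemory; illustration only.
class BlockNotFoundInMemorySketch : public std::exception {
 public:
  // Build the full message once, while the block ID is at hand.
  explicit BlockNotFoundInMemorySketch(const int block_id)
      : block_id_message_(
            "BlockNotFoundInMemory: The specified block with ID " +
            std::to_string(block_id) + " was not found in memory") {}

  // The returned pointer stays valid for as long as the exception object
  // lives, because the string is a member rather than a local variable.
  const char* what() const noexcept override {
    return block_id_message_.c_str();
  }

 private:
  const std::string block_id_message_;
};
```

A caller can then safely read `e.what()` inside a `catch` block, since the buffer is owned by the exception object itself.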


---


[GitHub] incubator-quickstep pull request #340: More informative error for BlockNotFo...

2018-04-17 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/340#discussion_r182210127
  
--- Diff: storage/StorageErrors.hpp ---
@@ -61,9 +61,16 @@ class BlockMemoryTooSmall : public std::exception {
  **/
 class BlockNotFoundInMemory : public std::exception {
  public:
+  BlockNotFoundInMemory(int block_id) : block_id_(block_id) {}
--- End diff --

Minor style fix:
```
explicit BlockNotFoundInMemory(const int block_id) : ...
```


---


[GitHub] incubator-quickstep issue #339: Upgrade cmake version.

2018-04-10 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/339
  
LGTM! Merging.


---


[GitHub] incubator-quickstep pull request #334: Fix iwyu include path

2018-02-26 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/334

Fix iwyu include path

This PR fixes the third-party library include paths for the iwyu 
(include-what-you-use) tool.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/incubator-quickstep fix-iwyu

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/334.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #334


commit c2ed5c69b6b8dad07d7410beb0c8292ea1a746e0
Author: Jianqiao Zhu <jianqiao@...>
Date:   2017-09-01T20:07:41Z

Fix iwyu include path




---


[GitHub] incubator-quickstep pull request #332: Small adjustments in star schema cost...

2018-02-23 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/332#discussion_r170369358
  
--- Diff: query_optimizer/cost_model/StarSchemaSimpleCostModel.cpp ---
@@ -493,7 +493,7 @@ std::size_t StarSchemaSimpleCostModel::getNumDistinctValues(
   return stat.getNumDistinctValues(rel_attr_id);
 }
   }
-  return estimateCardinalityForTableReference(table_reference);
+  return estimateCardinalityForTableReference(table_reference) * 0.1;
--- End diff --

This estimation ratio can be any decimal number that is not close to `1`. 
If it were close to `1`, the column would appear to have "unique" values and 
the optimizer would choose bad plans in some situations.

`0.1` tends to be a reasonable choice -- we may also have `0.05`, `0.2`, 
etc., which can be adjusted later when there are actual demands.
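
As a rough illustration of the fallback described above, here is a minimal sketch. It is not the actual `StarSchemaSimpleCostModel` code: the function name, its parameters, and the `fallback_ratio` argument are all hypothetical, and it assumes the ratio is only applied when no distinct-value statistic is available.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Quickstep. Uses the catalog statistic when
// one exists; otherwise estimates the number of distinct values as a fraction
// of the table cardinality.
std::size_t EstimateNumDistinctValues(const std::size_t table_cardinality,
                                      const bool has_stat,
                                      const std::size_t stat_num_distinct,
                                      const double fallback_ratio = 0.1) {
  // A ratio close to 1 would make the column look unique, so keep it in (0, 1].
  assert(fallback_ratio > 0.0 && fallback_ratio <= 1.0);
  if (has_stat) {
    return stat_num_distinct;
  }
  return static_cast<std::size_t>(table_cardinality * fallback_ratio);
}
```

Keeping the ratio as a parameter makes it easy to try `0.05` or `0.2` later without touching the call sites.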



---


[GitHub] incubator-quickstep pull request #333: Fix SeparateChainingHashTable::resize...

2018-02-18 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/333

Fix SeparateChainingHashTable::resize()

This PR fixes the problem that Quickstep hangs when resizing 
`SeparateChainingHashTable` during the execution of `BuildHashOperator`.

Here is a sequence of queries that reproduce the problem:
```
CREATE TABLE r(x INT, y INT);
CREATE TABLE s(x INT, y INT);
CREATE TABLE t(x INT, y INT);

INSERT INTO r SELECT 1, 1 FROM generate_series(1, 200) AS g(x);
INSERT INTO s SELECT 1, 1 FROM generate_series(1, 200) AS g(x);
INSERT INTO t SELECT 1, 1 FROM generate_series(1, 1000) AS g(x);

\analyze

SELECT COUNT(*) FROM r, s, t WHERE r.x = s.x AND r.y = s.y AND s.x = t.x 
AND s.y = t.y;
```

The problem is caused by the [`resize()` 
call](https://github.com/apache/incubator-quickstep/blob/master/storage/HashTable.hpp#L1514)
 in `HashTable::putValueAccessorCompositeKey()` when `using_prealloc` is true. 
In this case, pre-allocation decides to resize the hash table in order to 
consume all the tuples from the current value accessor. However, `resize()` 
will always abort if the hash table is not "actually full", causing an 
infinite loop.

Note that `SimpleScalarSeparateChainingHashTable` does not have the same 
problem, as its [`isFull` 
method](https://github.com/apache/incubator-quickstep/blob/master/storage/SimpleScalarSeparateChainingHashTable.hpp#L241)
 already takes `extra_buckets` into consideration.

Also note that `LinearOpenAddressingHashTable` seems to have avoided the 
hanging problem by using a [`retry_num` 
check](https://github.com/apache/incubator-quickstep/blob/master/storage/LinearOpenAddressingHashTable.hpp#L1203).
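
To illustrate the hang and the two remedies mentioned above, here is a toy model. It is not Quickstep's `HashTable` code: `ToyHashTable`, `putBatch`, and `max_retries` are made-up names. The sketch assumes that fullness is checked against the pending `extra_buckets` (in the spirit of the `isFull` method noted above) and that the insert loop is bounded by a retry counter (in the spirit of the `retry_num` check).

```cpp
#include <cstddef>

// Toy model of a pre-allocating hash table; all names are hypothetical.
struct ToyHashTable {
  std::size_t capacity;
  std::size_t used;

  // "Full" means the pending extra_buckets entries would not fit.
  bool isFull(const std::size_t extra_buckets) const {
    return used + extra_buckets > capacity;
  }

  // Grows only when the table cannot hold extra_buckets more entries.
  // Checking isFull(0) here instead would turn the call into a no-op for a
  // partially filled table, which is the hang described above.
  void resize(const std::size_t extra_buckets) {
    if (!isFull(extra_buckets)) {
      return;
    }
    while (isFull(extra_buckets)) {
      capacity = (capacity == 0) ? 1 : capacity * 2;
    }
  }

  // Bulk insert with pre-allocation; returns false if it gives up.
  bool putBatch(const std::size_t num_tuples, const int max_retries = 8) {
    int retry_num = 0;
    while (isFull(num_tuples)) {
      if (retry_num++ >= max_retries) {
        return false;  // Bounded retries: a faulty resize cannot spin forever.
      }
      resize(num_tuples);
    }
    used += num_tuples;
    return true;
  }
};
```

Either safeguard alone breaks the loop in this model; having both simply makes the failure mode explicit if `resize()` ever stops making progress.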

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jianqiao/incubator-quickstep fix-hash-resize

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/333.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #333


commit d1dbb0d9bc2d1f001deee4039157b0be464870f4
Author: Jianqiao Zhu <jianqiao@...>
Date:   2018-02-18T07:16:07Z

Fix the hanging problem of SeparateChainingHashTable::resize()




---


[GitHub] incubator-quickstep pull request #332: Small adjustments in star schema cost...

2018-02-07 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/332

Small adjustments in star schema cost model for # distinct values estimation

This PR makes a small adjustment to the star schema cost model's estimation 
of the number of distinct values, together with a fix for a potential bug in 
`impliesUniqueAttributes`.

The adjustment is likely to improve query plans _when table stats are not 
present_. It does not affect SSB/TPC-H performance.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jianqiao/incubator-quickstep adjust-cost

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/332.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #332


commit 8e94a8e7ef6c99e5c64d1d96bb7283f9f1154116
Author: Jianqiao Zhu <jianqiao@...>
Date:   2018-02-07T21:42:15Z

Small adjust in star schema cost model for # distinct values




---


[GitHub] incubator-quickstep pull request #331: Add a cmake option to handle the Trav...

2018-02-02 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/331

Add a cmake option to handle the Travis CI timeout problem.

This PR adds a cmake option `ENABLE_COMPARISON_INLINE_EXPANSION` to allow 
disabling of method specialization in various `Comparison`'s.

Turning the flag `OFF` will greatly reduce Quickstep compile time -- thus 
improving development productivity as well as fixing the Travis CI timeout 
problem. Note that the flag is by default `ON`, and will be [turned 
off](https://github.com/apache/incubator-quickstep/blob/539e1ebe09b5d1a2d86069ed1fdc6e9fb38c5ce7/.travis.yml#L80)
 during the Travis test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/incubator-quickstep fix-travis-timeout

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/331.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #331


commit 539e1ebe09b5d1a2d86069ed1fdc6e9fb38c5ce7
Author: Jianqiao Zhu <jianqiao@...>
Date:   2018-02-02T23:27:59Z

Add a flag to allow disabling of Comparison inline expansion to enable 
acceleration of Quickstep build.

(for development productivity as well as solving the Travis CI timeout 
problem)




---


[GitHub] incubator-quickstep issue #329: IDE Documentation fixes

2018-01-11 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/329
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #330: Upgraded benchmark third party library.

2018-01-11 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/330
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #319: Fixed the bug when partition w/ pruned colum...

2017-12-21 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/319
  
LGTM! Merging.


---


Re: Support for varchar(max)

2017-12-04 Thread Jianqiao
There is no TEXT type yet and it would be good to add the type.

Also there are two issues to be improved for VARCHAR:
(A) The current varchar is restricted to be within a storage block, i.e.
2MB by default configuration.
(B) For varchar with a relatively large size (e.g. varchar(8192)), a storage
block will only be partially filled and then marked full -- due to some
reservation check during bulk insert -- thus wasting storage space.


2017-12-04 17:20 GMT-06:00 Dylan Bacon :

> Varchar(MAX)/TEXT is a construct that lets you put in an arbitrary amount
> of text into the string with no defined upper limit unlike varchar(#) where
> # is the character limit. It's a small technical limitation to a project
> I'm working on if we don't have it but it's easy enough to work around, was
> mostly curious if we have that support. I'm shoving email bodies into QS
> and having arbitrary text would make that more powerful.
>
>
>
> On 12/4/17 5:17 PM, Robert Claus wrote:
>
>> I've used varchar successfully in Quickstep, but I don't know what
>> functions are supported.  Is there specific functionality you're looking
>> for?
>>
>> Ex. "CREATE TABLE Child (a int, b int, c varchar(20));"
>>
>> -Robert
>>
>> On Mon, Dec 4, 2017 at 5:01 PM, Dylan Bacon  wrote:
>>
>> Hello,
>>>
>>> Does Quickstep currently have support for arbitrary-length BLOB format
>>> varchars? Think TEXT or varchar(MAX) from SQL Server.
>>>
>>> --
>>> Regards,
>>>
>>> Dylan Bacon
>>> University of Wisconsin - Madison
>>> Department of Computer Sciences
>>> dba...@wisc.edu
>>>
>>>
>>>
>


[GitHub] incubator-quickstep issue #326: QUICKSTEP-112 Get the list of referenced bas...

2017-12-01 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/326
  
LGTM! Merging.


---


Re: Quickstep Network Mode and C++ Sockets

2017-11-30 Thread Jianqiao
Yes. NetworkCliClient is quite standalone. The dependencies are:

(1) #include <grpc++/grpc++.h>
(2) #include "cli/NetworkCli.grpc.pb.h"
(3) #include "cli/NetworkCli.pb.h"
(4) #include "utility/Macros.hpp"


To write your own client:

(A) Set up gRPC so that you can include the header files and link to it.

(B) Grab NetworkCli.proto
(https://github.com/apache/incubator-quickstep/blob/master/cli/NetworkCli.proto)
and change the package name if necessary (originally quickstep).
Either
(B.1) use grpc/protobuf tools to compile NetworkCli.proto to generate (2)
and (3) -- see
https://github.com/apache/incubator-quickstep/blob/master/cli/CMakeLists.txt#L53
or
(B.2) compile quickstep and grab the files from build/cli/

(C) Copy the NetworkCliClient class from QS into your client code.


(A)/(B) may be somewhat annoying to handle, as you need to search through
various pieces of documentation ...


Best,
Jianqiao

2017-11-30 17:07 GMT-06:00 Dylan Bacon <dba...@wisc.edu>:

> So NetworkCliClient should be something I'm able to include in my program
> along with the appropriate dependencies and use as the API? I was thinking
> about needing to do that but I wasn't sure if that was a standalone API QS
> has implemented or a core part of the system. Unless I'm being mistaken and
> you're talking about something from gRPC. This is my first time working
> with it.
>
>
>
> On 11/30/17 4:58 PM, Jianqiao wrote:
>
>> Hi Dylan,
>>
>> Currently the network mode is using gRPC, so you probably need to use the
>> corresponding API (see
>> https://github.com/apache/incubator-quickstep/blob/master/
>> cli/NetworkCliClientMain.cpp#L42
>> as an example). The raw socket connection won't work unless you hack
>> gRPC's
>> message exchange protocol ..
>>
>> Best,
>> Jianqiao
>>
>> 2017-11-30 16:49 GMT-06:00 Dylan Bacon <dba...@wisc.edu>:
>>
>> Hello,
>>>
>>> I am attempting to interface with Quickstep using its NetworkCliClient
>>> and
>>> it's not working as I would expect. I have the default port and IP set to
>>> 3000 and 0.0.0.0 and am attempting to send single queries to be processed
>>> over in my test harness. From what I could tell of the code when QS is in
>>> network mode it accepts a socket connection and string input from that
>>> function and processes it in NetworkCliClient.hpp and
>>> NetworkCliClientMain.cpp, and yet this is not happening with my test
>>> code.
>>> The connection is being established but Quickstep does not seem to be
>>> doing
>>> anything with the queries that come in.
>>>
>>> Attached is the test code that I am using. test is just a table by that
>>> name, I'm selecting a literal from it so the contents shouldn't matter.
>>> I've also attempted to create a table with this but Quickstep did not
>>> process that.
>>>
>>> --
>>> Regards,
>>>
>>> Dylan Bacon
>>> University of Wisconsin - Madison
>>> Department of Computer Sciences
>>> dba...@wisc.edu
>>>
>>>
>>>
>


[GitHub] incubator-quickstep pull request #326: QUICKSTEP-112 Get the list of referen...

2017-11-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/326#discussion_r154230379
  
--- Diff: query_optimizer/rules/ReferencedBaseRelations.hpp ---
@@ -0,0 +1,78 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_RULES_REFERENCED_BASE_RELATIONS_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_RULES_REFERENCED_BASE_RELATIONS_HPP_
+
+#include 
+#include 
+
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_optimizer/logical/Logical.hpp"
+#include "query_optimizer/rules/DFSTraversal.hpp"
+#include "utility/Macros.hpp"
+
+namespace quickstep {
+
+class CatalogRelation;
+
+namespace optimizer {
+
+class OptimizerContext;
+
+class ReferencedBaseRelations : public DFSTraversal {
--- End diff --

Since this class overrides the `apply` method itself, it can just inherit 
from `Rule` directly.
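
A simplified sketch of the point, using made-up stand-ins rather than Quickstep's real optimizer classes (`RuleSketch`, `TraversalSketch`, `Node`, and `visit` are all hypothetical): a traversal helper is only useful when a subclass relies on the helper's own `apply()`; a class that supplies its own `apply()` can derive from the base rule.

```cpp
#include <string>

// Hypothetical, simplified stand-ins; not Quickstep's actual classes.
struct Node {
  int visits = 0;
};

class RuleSketch {
 public:
  virtual ~RuleSketch() = default;
  virtual std::string getName() const = 0;
  virtual Node apply(const Node &input) = 0;
};

// A traversal helper implements apply() once and asks subclasses for a hook.
class TraversalSketch : public RuleSketch {
 public:
  Node apply(const Node &input) override { return visit(input); }

 protected:
  virtual Node visit(const Node &input) = 0;
};

// A rule that provides its own apply() gains nothing from the helper,
// so it derives from the base rule directly.
class ReferencedRelationsSketch : public RuleSketch {
 public:
  std::string getName() const override { return "ReferencedRelationsSketch"; }

  Node apply(const Node &input) override {
    Node output = input;
    ++output.visits;
    return output;
  }
};
```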


---


[GitHub] incubator-quickstep pull request #326: QUICKSTEP-112 Get the list of referen...

2017-11-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/326#discussion_r154229622
  
--- Diff: query_optimizer/rules/ReferencedBaseRelations.hpp ---
@@ -0,0 +1,78 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_RULES_REFERENCED_BASE_RELATIONS_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_RULES_REFERENCED_BASE_RELATIONS_HPP_
+
+#include 
+#include 
+
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_optimizer/logical/Logical.hpp"
+#include "query_optimizer/rules/DFSTraversal.hpp"
+#include "utility/Macros.hpp"
+
+namespace quickstep {
+
+class CatalogRelation;
+
+namespace optimizer {
+
+class OptimizerContext;
+
+class ReferencedBaseRelations : public DFSTraversal {
+ public:
+  /**
+   * @brief Constructor
+   * @param optimizer_context The optimizer context.
+   */
+  explicit ReferencedBaseRelations(OptimizerContext *optimizer_context)
+  : optimizer_context_(optimizer_context) {
+  }
+
+  std::string getName() const override { return "ReferencedBaseRelations"; 
}
+
+  TreeNodePtr apply(const TreeNodePtr ) override;
+
+  /**
+   * @brief Get the base relations referenced in a query.
+   */
+  const std::vector getReferencedBaseRelations() 
const {
--- End diff --

Better to remove the leading `const`, as the method returns a temporary 
object.
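
One concrete cost of a `const`-qualified by-value return is that callers cannot move out of the temporary, so an assignment falls back to a copy. The sketch below shows this with a made-up type, `Tracked`, that records which assignment operator ran; none of these names come from the PR.

```cpp
#include <utility>
#include <vector>

// Hypothetical type that records how it was last assigned; illustration only.
struct Tracked {
  std::vector<int> data;
  bool last_assign_was_move = false;

  Tracked() = default;
  Tracked(const Tracked &) = default;
  Tracked(Tracked &&) = default;

  Tracked& operator=(const Tracked &other) {
    data = other.data;
    last_assign_was_move = false;
    return *this;
  }

  Tracked& operator=(Tracked &&other) noexcept {
    data = std::move(other.data);
    last_assign_was_move = true;
    return *this;
  }
};

const Tracked MakeConst() { return Tracked(); }  // Returns a const temporary.
Tracked MakeNonConst() { return Tracked(); }     // Returns a plain temporary.

bool AssignFromConstMoves() {
  Tracked t;
  t = MakeConst();  // Only operator=(const Tracked &) can bind: a copy.
  return t.last_assign_was_move;
}

bool AssignFromNonConstMoves() {
  Tracked t;
  t = MakeNonConst();  // operator=(Tracked &&) is selected: a move.
  return t.last_assign_was_move;
}
```

For a `std::vector` return value the same overload resolution applies, so dropping the leading `const` lets callers take over the buffer instead of copying it.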


---


Re: Quickstep Network Mode and C++ Sockets

2017-11-30 Thread Jianqiao
Hi Dylan,

Currently the network mode is using gRPC, so you probably need to use the
corresponding API (see
https://github.com/apache/incubator-quickstep/blob/master/cli/NetworkCliClientMain.cpp#L42
as an example). The raw socket connection won't work unless you hack gRPC's
message exchange protocol ..

Best,
Jianqiao

2017-11-30 16:49 GMT-06:00 Dylan Bacon <dba...@wisc.edu>:

> Hello,
>
> I am attempting to interface with Quickstep using its NetworkCliClient and
> it's not working as I would expect. I have the default port and IP set to
> 3000 and 0.0.0.0 and am attempting to send single queries to be processed
> over in my test harness. From what I could tell of the code when QS is in
> network mode it accepts a socket connection and string input from that
> function and processes it in NetworkCliClient.hpp and
> NetworkCliClientMain.cpp, and yet this is not happening with my test code.
> The connection is being established but Quickstep does not seem to be doing
> anything with the queries that come in.
>
> Attached is the test code that I am using. test is just a table by that
> name, I'm selecting a literal from it so the contents shouldn't matter.
> I've also attempted to create a table with this but Quickstep did not
> process that.
>
> --
> Regards,
>
> Dylan Bacon
> University of Wisconsin - Madison
> Department of Computer Sciences
> dba...@wisc.edu
>
>


Re: problem with build quickstep

2017-11-29 Thread Jianqiao
Hi Song,

It seems to be a problem with higher versions of gcc/clang.

As a temporary fix please comment out (or remove) the following two lines
in incubator/CMakeLists.txt and see if it works:
https://github.com/apache/incubator-quickstep/compare/disable-flags

Best,
Jianqiao


2017-11-28 18:57 GMT-06:00 Song Zhao <szha...@wisc.edu>:

> hi Harshad
>
> My cmake version is 3.9.6, CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/c++
>
>
> Thank you,
> Song
>


[GitHub] incubator-quickstep issue #300: QUICKSTEP-106: Hash-Join-Fuse: Feature added...

2017-11-28 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/300
  
@zuyu Currently `HashJoinOperator` performance is sensitive to the _number 
of build blocks per probe block_ due to the concurrency bottleneck within the 
LRU policy enforcer.

Consider the situation where the build side relation has `N` blocks and the 
number of blocks decreases to `M` after applying the predicate, where `N` is 
very large but `M` is small. Then materializing the filtered build-side 
relation incurs only a small overhead, but it dramatically reduces the _number 
of build blocks per probe block_.




---


[GitHub] incubator-quickstep issue #300: QUICKSTEP-106: Hash-Join-Fuse: Feature added...

2017-11-28 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/300
  
Merging.


---


[GitHub] incubator-quickstep issue #325: DO NOT MERGE: Concurrent queries transaction...

2017-11-25 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/325
  
The design looks good to me! It would be better to fast forward to 
subsequent PRs to see the actual usage.


---


[GitHub] incubator-quickstep issue #300: QUICKSTEP-106: Hash-Join-Fuse: Feature added...

2017-11-21 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/300
  
@dylanpbacon Hi Dylan, I updated the LIP related stuff and put it in this 
[branch](https://github.com/apache/incubator-quickstep/tree/Hash-Join-Fuse), 
you may just fetch the changes to reset your repo's `Hash-Join-Fuse` branch, 
then `git push -f`, then we can merge this PR.

**Note:**
The optimization is enabled by default, and I added an extra gflag 
`fuse-hash-select-threshold` that is set to one million (`1000000u`) by 
default. A fusion transformation is applied only when the estimated cardinality 
of the build-side selection is _greater_ than the threshold.

Overall, the fuse-hash-select optimization is especially 
beneficial when _the build-side selection has large output cardinality_ (e.g. 
TPC-H Q21), and the benefits come from two aspects:
(1) smaller memory footprint,
(2) avoiding materialization of the selection's output.

However, due to some issues in the current implementation of 
HashJoinOperator (and the buffer manager), the fusion may slow down some 
queries (e.g. TPC-H Q02, 400ms -> 700ms with LIP fixed). The 
`fuse-hash-select-threshold` flag prevents those situations.
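
The decision described above boils down to a single comparison. A minimal sketch, assuming the default of one million stated in the note; the constant and function names are hypothetical and do not reflect how the flag is actually wired into the optimizer:

```cpp
#include <cstdint>

// Hypothetical constant mirroring the default described above (one million).
constexpr std::uint64_t kFuseHashSelectThreshold = 1000000u;

// Returns true when the build-side selection should be fused into the join,
// i.e. when its estimated cardinality is strictly greater than the threshold.
bool ShouldFuseHashSelect(
    const std::uint64_t estimated_build_cardinality,
    const std::uint64_t threshold = kFuseHashSelectThreshold) {
  return estimated_build_cardinality > threshold;
}
```

With this rule a small build-side selection keeps the original plan and is materialized first, while a large one is fused to save memory and avoid the materialization.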



---


[GitHub] incubator-quickstep issue #321: Fix number of work orders generated for inse...

2017-11-20 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/321
  
LGTM! Merging.



---


[GitHub] incubator-quickstep issue #323: Temporary Build Support for OS X 10.13

2017-11-18 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/323
  
Merged


---


[GitHub] incubator-quickstep issue #323: Temporary Build Support for OS X 10.13

2017-11-18 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/323
  
LGTM! Can you squash the two commits into one? Then I can merge this PR.

(To squash the two commits, go to the repo, `git rebase -i HEAD~2`, change 
the second `pick` to `fixup`, then save & exit.)


---


Re: cmake error

2017-11-10 Thread Jianqiao
Hi Om,

As a quick fix you can comment out (by adding '#' in front) or simply remove
the following two lines in incubator-quickstep/CMakeLists.txt:

Line 294
<https://github.com/apache/incubator-quickstep/blob/master/CMakeLists.txt#L294>
: set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Werror")

Line 588
<https://github.com/apache/incubator-quickstep/blob/master/CMakeLists.txt#L588>
: set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-return-type-c-linkage")

This should fix the problem for gcc-6.3. We may add systematic fixes for
different compiler versions later.

Best,
Jianqiao

2017-11-09 18:21 GMT-06:00 Om Jadhav <ojad...@wisc.edu>:

> Hi,
>
> It’s gcc version 6.3.0.
>
> gcc -v
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
> Target: x86_64-linux-gnu
> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
> 6.3.0-18ubuntu2~16.04' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs
> --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
> --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared
> --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext
> --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/
> --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
> --with-default-libstdcxx-abi=new --enable-gnu-unique-object
> --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib
> --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo
> --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre
> --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64
> --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64
> --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
> --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch
> --disable-werror --with-arch-32=i686 --with-abi=m64
> --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
> --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
> --target=x86_64-linux-gnu
> Thread model: posix
> gcc version 6.3.0 20170519 (Ubuntu/Linaro 6.3.0-18ubuntu2~16.04)

Re: cmake error

2017-11-09 Thread Jianqiao
It seems to be a problem related to C++ compiler version. Can you check its
version by using command:
gcc -v

The fix should be a few lines of changes in root directory's/glog's
CMakeLists.txt.

Best,
Jianqiao


2017-11-08 15:30 GMT-06:00 Om Jadhav <ojad...@wisc.edu>:

> Hi Jianqiao,
>
> Please find the make error below:
>
> [  7%] Completed 'libtcmalloc_ext'
> [  7%] Built target libtcmalloc_ext
> [  7%] Building CXX object third_party/googletest/
> googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
> [  7%] Linking CXX static library libgtest.a
> [  7%] Built target gtest
> [  7%] Building CXX object third_party/gflags/CMakeFiles/
> gflags_nothreads-static.dir/src/gflags.cc.o
> /home/omjadhav/quickstep/third_party/src/gflags/src/gflags.cc:443:5:
> error: ‘int google::{anonymous}::FlagValue::ValueSize() const’ defined
> but not used [-Werror=unused-function]
>  int FlagValue::ValueSize() const {
>  ^
> cc1plus: error: unrecognized command line option
> ‘-Wno-return-type-c-linkage’ [-Werror]
> cc1plus: all warnings being treated as errors
> third_party/gflags/CMakeFiles/gflags_nothreads-static.dir/build.make:62:
> recipe for target 
> 'third_party/gflags/CMakeFiles/gflags_nothreads-static.dir/src/gflags.cc.o'
> failed
> make[2]: *** 
> [third_party/gflags/CMakeFiles/gflags_nothreads-static.dir/src/gflags.cc.o]
> Error 1
> CMakeFiles/Makefile2:939: recipe for target 'third_party/gflags/
> CMakeFiles/gflags_nothreads-static.dir/all' failed
> make[1]: *** [third_party/gflags/CMakeFiles/gflags_nothreads-static.dir/all]
> Error 2
> Makefile:138: recipe for target 'all' failed
> make: *** [all] Error 2
>
>
> Thanks
> Om

Re: cmake error

2017-11-06 Thread Jianqiao
Hi Om,

It seems that your "cmake" output is okay. Can you also provide the "make"
error message?

Best,
Jianqiao

2017-11-06 11:34 GMT-06:00 Harshad Deshmukh <hars...@cs.wisc.edu>:

> Hi Om,
>
> What's your build setup? Did you download the prerequisites and
> initialized the git submodules?
>
> Get Outlook for Android<https://aka.ms/ghei36>
>
> 
> From: Om Jadhav <ojad...@wisc.edu>
> Sent: Friday, November 3, 2017 3:42:05 PM
> To: dev@quickstep.incubator.apache.org
> Subject: cmake error
>
> Hello,
>
> I am trying to cmake, and I am getting most of the things failed for the
> first time. And also the build is failing after this cmake.
>
> o/p:
>
> Vector copy elision level set to: single-relation selection
> -- git Version: v0.0.0
> -- Version: 0.0.0
> -- Performing Test HAVE_STD_REGEX
> -- Performing Test HAVE_STD_REGEX -- success
> -- Performing Test HAVE_GNU_POSIX_REGEX
> -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
> -- Performing Test HAVE_POSIX_REGEX
> -- Performing Test HAVE_POSIX_REGEX -- success
> -- Performing Test HAVE_STEADY_CLOCK
> -- Performing Test HAVE_STEADY_CLOCK -- success
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.gregs[REG_EIP]
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.gregs[REG_RIP]
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member: uc_mcontext.sc_ip
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.uc_regs->gregs[PT_NIP]
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.gregs[R15]
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.arm_pc
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.mc_eip
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.mc_rip
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.__gregs[_REG_EIP]
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext.__gregs[_REG_RIP]
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext->ss.eip
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext->__ss.__eip
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext->ss.rip
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext->__ss.__rip
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext->ss.srr0
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> -- Checking program counter fetch from ucontext_t member:
> uc_mcontext->__ss.__srr0
> -- Performing Test PC_FROM_UCONTEXT_COMPILES
> -- Performing Test PC_FROM_UCONTEXT_COMPILES - Failed
> CMake Warning at third_party/src/glog/CMakeLists.txt:185 (message):
>   Unable to find program counter field in ucontext_t.  GLOG signal handler
>   will not be able to report precise PC position.
>
>
> You appear to be building on a Linux system with HugeTLB support. To take
> advantage of this feature, you will need to configure kernel support for
> hugepages by setting /proc/sys/vm/nr_hugepages and/or
> /proc/sys/vm/nr_overcommit_hugepages 

[GitHub] incubator-quickstep issue #320: Support Multiple Tuple Inserts

2017-10-27 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/320
  
Everything before `ExecutionGenerator` may be fine as is. I think the fix is to 
revise `ExecutionGenerator::convertInsertTuple()` and make some modifications 
inside `InsertOperator`.


---


[GitHub] incubator-quickstep issue #316: Support Multiple Tuple Inserts

2017-10-24 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/316
  
LGTM! Will merge after travis-ci's tests.


---


[GitHub] incubator-quickstep issue #314: Added Vector Aggregation support in the dist...

2017-10-12 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/314
  
LGTM! Merging.


---


[GitHub] incubator-quickstep pull request #315: [DO NOT MERGE] Refactor type system t...

2017-10-11 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/315

[DO NOT MERGE] Refactor type system to provide better extensibility of 
types and functions

This is a preliminary PR that is not ready to be merged but provides an 
overall view of the type system refactoring work. Many constructs are at their 
initial designs and may be further improved.

The PR aims at reviewing the refactoring designs at the "architecture" 
level. Detailed code style and unit test issues may be addressed later in 
subsequent concrete PRs.


The overall purpose of the refactoring is to improve the extensibility of 
the existing type/function system (i.e. support more kinds of types/functions 
and make it easier to add new types and functions), while retaining the 
performance of the current system.

### Major Changes
 Part I. Type System
---
# 1. Categorize all types into four [_memory 
layouts_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/TypeID.hpp#L64).

The four memory layouts are:
* __CxxInlinePod__ (C++ plain old data)
* __ParInlinePod__ (Parameterized inline plain old data)
* __ParOutOfLinePod__ (Parameterized out-of-line plain old data)
* __CxxGeneric__ (C++ generic types)

Memory layout decides how the corresponding type's values are stored and 
represented.

Briefly speaking,
* _CxxInlinePod_ corresponds to C++ primitive types or POD structs.
  * E.g. _int_, _double_, _struct { double x, double y }_.
  * The size of a CxxInlinePod value is known at C++ compile time (e.g. 
_double_ has size 8, _struct { double x, double y }_ has size 16).
* _ParInlinePod_ corresponds to database defined "fixed length" types.
  * E.g. _Char(8)_, _Char(20)_.
  * The size of such types' values is not known at C++ compile time. 
Instead, the type is parameterized by an unsigned integer, where the 
parameter's value is known at SQL query compile time (which is C++ run-time).
* _ParOutOfLinePod_ corresponds to database defined "variable length" types.
  * E.g. _Varchar(20)_.
  * The size of such types' values is not known until SQL query run-time.
* _CxxGeneric_ corresponds to C++ general types (i.e. any C++ type).
  * E.g. `std::set<int>`, `std::vector<const Type*>`.
  * Such types have to implement serialization/deserialization methods to 
have storage support.
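
As a rough illustration of when each layout's value size becomes known, here is a minimal C++ sketch. All names (`MemoryLayout`, `Point`, `CxxInlinePodSize`, `ParInlinePodSize`) are hypothetical and do not mirror the actual definitions in `types/TypeID.hpp`; the sketch also assumes that a `Char(N)` value occupies exactly N bytes.

```cpp
#include <cstddef>
#include <type_traits>

// Illustrative enumeration of the four memory layouts listed above.
enum class MemoryLayout {
  kCxxInlinePod,
  kParInlinePod,
  kParOutOfLinePod,
  kCxxGeneric
};

// Example POD struct, analogous to struct { double x, double y }.
struct Point {
  double x;
  double y;
};

// CxxInlinePod: the value size is a C++ compile-time constant.
template <typename T>
constexpr std::size_t CxxInlinePodSize() {
  static_assert(std::is_trivially_copyable<T>::value,
                "CxxInlinePod types are expected to be plain old data");
  return sizeof(T);
}

// ParInlinePod: the value size is fixed by the type's integer parameter,
// which is only available at C++ run-time (SQL query compile time).
// Assumption: Char(N) occupies exactly N bytes.
inline std::size_t ParInlinePodSize(const std::size_t length_parameter) {
  return length_parameter;
}
```

On common 64-bit platforms `CxxInlinePodSize<Point>()` evaluates to 16 during compilation, whereas `ParInlinePodSize(20)` can only be evaluated once a query mentioning `Char(20)` has been parsed.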
---
# 2. Use 
[_TypeIDTrait_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/TypeRegistrar.hpp#L59)
 to make much of a type's information available at compile time.

With this per-type trait information, we can avoid much boilerplate code 
for each subclass of _Type_ by using template techniques and specializing on the 
memory layout. See 
[_TypeSynthesizer_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/TypeSynthesizer.hpp)
 and 
[_TypeFactory_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/TypeFactory.cpp#L69).

_TypeIDTrait_ is also extensively used in many other places as it provides 
all the required compile-time information about a type.
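
The idea of a per-type trait can be sketched as follows. This is a simplified illustration, not the actual `TypeIDTrait` from `types/TypeRegistrar.hpp`: the member names `cpptype` and `kMemoryLayout`, the reduced `TypeID` enumeration, and the helper `HasCompileTimeSize` are all assumptions made for the example.

```cpp
#include <cstdint>
#include <type_traits>

// Reduced, illustrative enumerations.
enum class TypeID { kInt, kDouble, kChar };
enum class MemoryLayout { kCxxInlinePod, kParInlinePod };

// Primary template is left undefined; each type ID gets a specialization.
template <TypeID id>
struct TypeIDTrait;

template <>
struct TypeIDTrait<TypeID::kInt> {
  typedef std::int32_t cpptype;
  static constexpr MemoryLayout kMemoryLayout = MemoryLayout::kCxxInlinePod;
};

template <>
struct TypeIDTrait<TypeID::kDouble> {
  typedef double cpptype;
  static constexpr MemoryLayout kMemoryLayout = MemoryLayout::kCxxInlinePod;
};

template <>
struct TypeIDTrait<TypeID::kChar> {
  typedef void* cpptype;  // Values are referenced through a pointer.
  static constexpr MemoryLayout kMemoryLayout = MemoryLayout::kParInlinePod;
};

// Generic code can branch on the trait at compile time instead of
// repeating the same logic in every Type subclass.
template <TypeID id>
constexpr bool HasCompileTimeSize() {
  return TypeIDTrait<id>::kMemoryLayout == MemoryLayout::kCxxInlinePod;
}
```

Because everything here is resolved by the compiler, a template such as `HasCompileTimeSize<TypeID::kInt>()` carries no run-time cost, which is the property the refactoring relies on to retain performance.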

---

# 3. Support more types.
Details will be written later about how to add a new type into the 
Quickstep system.

The current PR has some example types added:
* The 
[_Bool_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/BoolType.hpp)
 type. It will be used later for connecting scalar functions and predicates.
* The 
[_Text_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/TextType.hpp)
 type. A general non-parameterized string type.
  * __TODO:__ We need some updates in the storage block module (potentially 
also other places) to handle the "infinite maximum byte size" types.
* The 
[_MetaType_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/MetaType-decl.hpp)
 type. It is "type of type". I.e. a value of _MetaType_ has C++ type _const 
Type*_.
* The 
[_Array_](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/ArrayType.hpp)
 type. A generic type that represents an array. This type takes a MetaType 
value as parameter, where the parameter specifies the array's element type.
  * __TODO__: We need specialized array types such as _IntArray_ and 
_TextArray_ for performance consideration.

---
# 4. Improve the type casting mechanism.

Type casting (coercion) is an important feature that is needed in practice 
from time to time.

This PR's design defines an overall 
[template](https://github.com/apache/incubator-quickstep/blob/refactor-type/types/operations/unary_operations/CastFunctorOverloads.hpp#L41)
```
template 
struct CastFunctor;
```
which is then specialized by different source/target 

[GitHub] incubator-quickstep issue #304: Added a new set API for TupleIdSequence.

2017-10-09 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/304
  
LGTM! Merging.


---


[GitHub] incubator-quickstep pull request #299: QUICKSTEP-104 Fix the problem that Li...

2017-09-20 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/299#discussion_r140061423
  
--- Diff: cli/LineReader.cpp ---
@@ -171,7 +173,7 @@ std::string LineReader::getNextCommand() {
 case '.':
 case '\\':  //  Fall Through.
   // If the dot or forward slash begins the line, begin a 
command search.
-  if (scan_position == 0) {
+  if (special_char_location == 
multiline_buffer.find_first_not_of(" \t\r\n")) {
--- End diff --

Yes it should cover all the possible situations.


---


[GitHub] incubator-quickstep pull request #299: QUICKSTEP-104 Fix the problem that Li...

2017-09-20 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/299#discussion_r140060711
  
--- Diff: cli/tests/command_executor/D.test ---
@@ -69,6 +71,7 @@ INSERT INTO foo3 values(5, 1, 1.0, 1.0, 'XYZZ');
  col4   | Float  
  col5   | Char(5)
 ==
+
--- End diff --

Probably no. The test input does not go through `LineReader`.


---


[GitHub] incubator-quickstep pull request #299: QUICKSTEP-104 Fix the problem that Li...

2017-09-19 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/299

QUICKSTEP-104 Fix the problem that LineReader cannot recognize a command if 
there are whitespaces before it.

This PR fixes a bug where the Quickstep REPL fails to recognize a command 
(e.g. `\d`, `\analyze`) if whitespace or empty lines precede the 
command.
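
The check behind this fix can be sketched in isolation as below. The helper name `BeginsWithCommandChar` is hypothetical and the function is a standalone simplification, not the actual `LineReader::getNextCommand()` logic; it only demonstrates using `std::string::find_first_not_of` to skip leading whitespace before testing for a command character.

```cpp
#include <string>

// Returns true if the first non-whitespace character in `buffer` is a
// command prefix ('.' or '\\'). Leading spaces, tabs, and line breaks are
// skipped, so "  \n\\d" is still treated as a command.
inline bool BeginsWithCommandChar(const std::string &buffer) {
  const std::string::size_type first = buffer.find_first_not_of(" \t\r\n");
  if (first == std::string::npos) {
    return false;  // Empty or whitespace-only input.
  }
  return buffer[first] == '.' || buffer[first] == '\\';
}
```

A check of the form `position == 0` would reject `"  \\d"`, while comparing against the first non-whitespace position accepts it and still rejects a statement such as `SELECT '.';`.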

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jianqiao/incubator-quickstep 
fix-extract-command

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/299.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #299


commit 77960a42dcfb3d27de5601548a04d81a6be79375
Author: Jianqiao Zhu <jianq...@cs.wisc.edu>
Date:   2017-09-20T03:02:02Z

Fix a bug in LineReader for recognizing command




---


[GitHub] incubator-quickstep issue #298: Prune columns after partition rule.

2017-09-19 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/298
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #297: Fixed a bug in partitioned NLJ.

2017-09-14 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/297
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #271: QUICKSTEP-95: Fixed the exception due to zer...

2017-09-14 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/271
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #296: QUICKSTEP-78: Displayed Partition Info using...

2017-09-11 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/296
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #293: Added Partition rules for Sort.

2017-09-11 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/293
  
LGTM. Merging.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199350
  
--- Diff: relational_operators/TextScanOperator.cpp ---
@@ -82,14 +81,19 @@ static bool ValidateTextScanTextSegmentSize(const char 
*flagname,
   return true;
 }
 
-static const volatile bool text_scan_text_segment_size_dummy = 
gflags::RegisterFlagValidator(
-&FLAGS_textscan_text_segment_size, &ValidateTextScanTextSegmentSize);
+static const volatile bool text_scan_text_segment_size_dummy =
+gflags::RegisterFlagValidator(
+&FLAGS_textscan_text_segment_size, 
&ValidateTextScanTextSegmentSize);
 
 namespace {
 
-size_t getFileSize(const string _name) {
+static std::size_t GetFileSize(const std::string _name) {
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198329
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -143,6 +145,45 @@ namespace E = ::quickstep::optimizer::expressions;
 namespace L = ::quickstep::optimizer::logical;
 namespace S = ::quickstep::serialization;
 
+namespace {
+
+static attribute_id GetAttributeIdFromName(
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136197963
  
--- Diff: query_optimizer/logical/LogicalType.hpp ---
@@ -34,6 +34,7 @@ namespace logical {
 enum class LogicalType {
   kAggregate,
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136197848
  
--- Diff: query_optimizer/logical/CopyTo.hpp ---
@@ -0,0 +1,141 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_LOGICAL_COPY_TO_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_LOGICAL_COPY_TO_HPP_
+
+#include 
+#include 
+#include 
+
+#include "query_optimizer/OptimizerTree.hpp"
+#include "query_optimizer/expressions/AttributeReference.hpp"
+#include "query_optimizer/logical/Logical.hpp"
+#include "query_optimizer/logical/LogicalType.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+namespace optimizer {
+namespace logical {
+
+/** \addtogroup OptimizerLogical
+ *  @{
+ */
+
+class CopyTo;
+typedef std::shared_ptr CopyToPtr;
+
+/**
+ * @brief Represents an operation that copies data from a relation to a 
text file.
+ */
+class CopyTo : public Logical {
+ public:
+  LogicalType getLogicalType() const override {
+return LogicalType::kCopyTo;
+  }
+
+  std::string getName() const override {
+return "CopyTo";
+  }
+
+  /**
+   * @return The input relation whose data is to be exported.
+   */
+  const LogicalPtr& input() const {
+return input_;
+  }
+
+  /**
+   * @return The name of the file to write the data to.
+   */
+  const std::string& file_name() const {
+return file_name_;
+  }
+
+  /**
+   * @return The options for this COPY TO statement.
+   */
+  BulkIOConfigurationPtr options() const {
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136180815
  
--- Diff: query_optimizer/logical/CopyFrom.hpp ---
@@ -66,20 +67,14 @@ class CopyFrom : public Logical {
   const std::string& file_name() const { return file_name_; }
 
   /**
-   * @return The delimiter used in the text file to separate columns.
+   * @return The options for this COPY FROM statement.
*/
-  const char column_delimiter() const { return column_delimiter_; }
-
-  /**
-   * @return Whether to decode escape sequences in the text file.
-   */
-  bool escape_strings() const { return escape_strings_; }
+  BulkIOConfigurationPtr options() const { return options_; }
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136178849
  
--- Diff: parser/ParseStatement.hpp ---
@@ -771,122 +776,135 @@ class ParseStatementInsertSelection : public 
ParseStatementInsert {
   DISALLOW_COPY_AND_ASSIGN(ParseStatementInsertSelection);
 };
 
+
 /**
- * @brief Optional parameters for a COPY FROM statement.
+ * @brief The parsed representation of a COPY FROM/COPY TO statement.
  **/
-struct ParseCopyFromParams : public ParseTreeNode {
-  /**
-   * @brief Constructor, sets default values.
-   **/
-  ParseCopyFromParams(const int line_number, const int column_number)
-  : ParseTreeNode(line_number, column_number),
-escape_strings(true) {
-  }
-
-  std::string getName() const override { return "CopyFromParams"; }
-
+class ParseStatementCopy : public ParseStatement {
+ public:
   /**
-   * @brief Sets the column delimiter.
-   *
-   * @param delimiter_in The column delimiter string.
+   * @brief Copy direction (FROM text file/TO text file).
*/
-  void set_delimiter(ParseString* delimiter_in) {
-delimiter.reset(delimiter_in);
-  }
-
-  /**
-   * @brief The string which terminates individual attribute values in the
-   *input file. Can be NULL.
-   **/
-  std::unique_ptr delimiter;
+  enum CopyDirection {
+kFrom,
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199193
  
--- Diff: relational_operators/TableExportOperator.hpp ---
@@ -0,0 +1,268 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+#define QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "catalog/CatalogRelation.hpp"
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_execution/QueryContext.hpp"
+#include "relational_operators/RelationalOperator.hpp"
+#include "relational_operators/WorkOrder.hpp"
+#include "storage/StorageBlockInfo.hpp"
+#include "threading/SpinMutex.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+#include "tmb/id_typedefs.h"
+
+namespace tmb { class MessageBus; }
+
+namespace quickstep {
+
+class CatalogRelationSchema;
+class StorageManager;
+class ValueAccessor;
+class WorkOrderProtosContainer;
+class WorkOrdersContainer;
+
+namespace serialization { class WorkOrder; }
+
+/** \addtogroup RelationalOperators
+ *  @{
+ */
+
+class TableExportOperator : public RelationalOperator {
+ public:
+  /**
+   * @brief Feedback message to Foreman when a TableExportToStringWorkOrder has
+   *        completed writing a block to the string buffer.
+   */
+  enum FeedbackMessageType : WorkOrder::FeedbackMessageType {
+  kBlockOutputMessage,
+  };
+
+  /**
+   * @brief Constructor.
+   *
+   * @param query_id The ID of the query to which this operator belongs.
+   * @param input_relation The relation to export.
+   * @param input_relation_is_stored If input_relation is a stored relation and
+   *        is fully available to the operator before it can start generating
+   *        workorders.
+   * @param file_name The name of the file to export the relation to.
+   * @param options The options that specify the detailed format of the output
+   *        file.
+   */
+  TableExportOperator(const std::size_t query_id,
+                      const CatalogRelation &input_relation,
+                      const bool input_relation_is_stored,
+                      const std::string &file_name,
+                      const BulkIOConfigurationPtr &options)
+      : RelationalOperator(query_id),
+        input_relation_(input_relation),
+        input_relation_is_stored_(input_relation_is_stored),
+        file_name_(file_name),
+        options_(options),
+        input_relation_block_ids_(input_relation_is_stored
+                                      ? input_relation.getBlocksSnapshot()
+                                      : std::vector<block_id>()),
+        num_workorders_generated_(0),
+        started_(false),
+        num_blocks_written_(0),
+        file_(nullptr) {}
+
+  ~TableExportOperator() override {}
+
+  OperatorType getOperatorType() const override {
+    return kTableExport;
+  }
+
+  std::string getName() const override {
+    return "TableExportOperator";
+  }
+
+  /**
+   * @return The relation to export.
+   */
+  const CatalogRelation& input_relation() const {
+    return input_relation_;
+  }
+
+  bool getAllWorkOrders(WorkOrdersContainer *container,
+                        QueryContext *query_context,
+                        StorageManager *storage_manager,
+                        const tmb::client_id scheduler_client_id,
+                        tmb::MessageBus *bus) override;
+
+  bool getAllWorkOrderProtos(WorkOrderProtosContainer *con

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198182
  
--- Diff: query_optimizer/physical/CopyFrom.hpp ---
@@ -68,22 +69,14 @@ class CopyFrom : public Physical {
   const std::string& file_name() const { return file_name_; }
 
   /**
-   * @return The delimiter used in the text file to separate columns.
+   * @return The options for this COPY FROM statement.
    */
-  const char column_delimiter() const { return column_delimiter_; }
-
-  /**
-   * @return Whether to decode escape sequences in the text file.
-   */
-  bool escape_strings() const { return escape_strings_; }
+  BulkIOConfigurationPtr options() const { return options_; }
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198572
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-    const ParseStatementCopyFrom &copy_from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+    const ParseStatementCopy &copy_from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == ParseStatementCopy::kFrom);
+  const PtrList<ParseKeyValue> *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-    if (params->delimiter != nullptr) {
-      column_delimiter_ = params->delimiter->value();
-      if (column_delimiter_.size() != 1) {
-        THROW_SQL_ERROR_AT(params->delimiter)
-            << "DELIMITER is not a single character";
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        // TODO(jianqiao): Support other bulk load formats such as CSV.
+        if (format != "text") {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        // Update file_format when other formats get supported.
+        break;
+      }
+    }
+  }
+
+  std::unique_ptr<BulkIOConfiguration> options =
+      std::make_unique<BulkIOConfiguration>(file_format);
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "delimiter") {
+        const ParseString *parse_delimiter = GetKeyValueString(param);
+        const std::string &delimiter = parse_delimiter->value();
+        if (delimiter.size() != 1) {
+          THROW_SQL_ERROR_AT(parse_delimiter)
+              << "DELIMITER is not a single character";
+        }
+        options->setDelimiter(delimiter.front());
+      } else if (key == "escape_strings") {
+        options->setEscapeStrings(GetKeyValueBool(param));
+      } else if (key != "format") {
+        THROW_SQL_ERROR_AT(&param) << "Unsupported copy option: " << key;
       }
     }
-    escape_strings_ = params->escape_strings;
   }
 
   return L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
-                             copy_from_statement.source_filename()->value(),
-                             column_delimiter_[0],
-                             escape_strings_);
+                             copy_from_statement.file_name()->value(),
+                             BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+    const ParseStatementCopy &copy_to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList<ParseKeyValue> *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        if (format == "csv") {
+          file_format = BulkIOFormat::kCSV;
+        } else if (format == "text") {
+          file_format = BulkIOFormat::kText;
+        } else {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        format_specified = true;
+        break;
+      }
+    }
+  }
+
+  const std::string &file_name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198255
  
--- Diff: query_optimizer/physical/CopyTo.hpp ---
@@ -0,0 +1,147 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_PHYSICAL_COPY_TO_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_PHYSICAL_COPY_TO_HPP_
+
+#include 
+#include 
+#include 
+
+#include "query_optimizer/OptimizerTree.hpp"
+#include "query_optimizer/expressions/AttributeReference.hpp"
+#include "query_optimizer/physical/Physical.hpp"
+#include "query_optimizer/physical/PhysicalType.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+namespace optimizer {
+namespace physical {
+
+/** \addtogroup OptimizerPhysical
+ *  @{
+ */
+
+class CopyTo;
+typedef std::shared_ptr<const CopyTo> CopyToPtr;
+
+/**
+ * @brief Represents an operation that copies data from a relation to a text file.
+ */
+class CopyTo : public Physical {
+ public:
+  PhysicalType getPhysicalType() const override {
+    return PhysicalType::kCopyTo;
+  }
+
+  std::string getName() const override {
+    return "CopyTo";
+  }
+
+  /**
+   * @return The input relation whose data is to be exported.
+   */
+  const PhysicalPtr& input() const {
+    return input_;
+  }
+
+  /**
+   * @return The name of the file to write the data to.
+   */
+  const std::string& file_name() const {
+    return file_name_;
+  }
+
+  /**
+   * @return The options for this COPY TO statement.
+   */
+  BulkIOConfigurationPtr options() const {
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198306
  
--- Diff: query_optimizer/physical/PhysicalType.hpp ---
@@ -34,6 +34,7 @@ namespace physical {
 enum class PhysicalType {
   kAggregate,
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198462
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-    const ParseStatementCopyFrom &copy_from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+    const ParseStatementCopy &copy_from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == ParseStatementCopy::kFrom);
+  const PtrList<ParseKeyValue> *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-    if (params->delimiter != nullptr) {
-      column_delimiter_ = params->delimiter->value();
-      if (column_delimiter_.size() != 1) {
-        THROW_SQL_ERROR_AT(params->delimiter)
-            << "DELIMITER is not a single character";
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        // TODO(jianqiao): Support other bulk load formats such as CSV.
+        if (format != "text") {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        // Update file_format when other formats get supported.
+        break;
+      }
+    }
+  }
+
+  std::unique_ptr<BulkIOConfiguration> options =
+      std::make_unique<BulkIOConfiguration>(file_format);
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198547
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-    const ParseStatementCopyFrom &copy_from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+    const ParseStatementCopy &copy_from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == ParseStatementCopy::kFrom);
+  const PtrList<ParseKeyValue> *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-    if (params->delimiter != nullptr) {
-      column_delimiter_ = params->delimiter->value();
-      if (column_delimiter_.size() != 1) {
-        THROW_SQL_ERROR_AT(params->delimiter)
-            << "DELIMITER is not a single character";
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        // TODO(jianqiao): Support other bulk load formats such as CSV.
+        if (format != "text") {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        // Update file_format when other formats get supported.
+        break;
+      }
+    }
+  }
+
+  std::unique_ptr<BulkIOConfiguration> options =
+      std::make_unique<BulkIOConfiguration>(file_format);
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "delimiter") {
+        const ParseString *parse_delimiter = GetKeyValueString(param);
+        const std::string &delimiter = parse_delimiter->value();
+        if (delimiter.size() != 1) {
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199047
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -1595,6 +1742,19 @@ void Resolver::appendProjectIfNeedPrecomputationAfterAggregation(
   }
 }
 
+void Resolver::reportIfWithClauseUnused(
+const PtrVector _list) const {
+  if (!with_queries_info_.unreferenced_query_indexes.empty()) {
+    int unreferenced_with_query_index = *with_queries_info_.unreferenced_query_indexes.begin();
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199940
  
--- Diff: utility/ExecutionDAGVisualizer.cpp ---
@@ -55,12 +55,15 @@ using std::to_string;
 namespace quickstep {
 
 DEFINE_bool(visualize_execution_dag_partition_info, false,
-            "If true, display the operator partition info in the visualized execution plan DAG."
-            "Valid iif 'visualize_execution_dag' turns on.");
+            "If true, display the operator partition info in the visualized "
+            "execution plan DAG. Valid if 'visualize_execution_dag' turns on.");
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199148
  
--- Diff: relational_operators/TableExportOperator.hpp ---
@@ -0,0 +1,268 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+#define QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "catalog/CatalogRelation.hpp"
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_execution/QueryContext.hpp"
+#include "relational_operators/RelationalOperator.hpp"
+#include "relational_operators/WorkOrder.hpp"
+#include "storage/StorageBlockInfo.hpp"
+#include "threading/SpinMutex.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+#include "tmb/id_typedefs.h"
+
+namespace tmb { class MessageBus; }
+
+namespace quickstep {
+
+class CatalogRelationSchema;
+class StorageManager;
+class ValueAccessor;
+class WorkOrderProtosContainer;
+class WorkOrdersContainer;
+
+namespace serialization { class WorkOrder; }
+
+/** \addtogroup RelationalOperators
+ *  @{
+ */
+
+class TableExportOperator : public RelationalOperator {
+ public:
+  /**
+   * @brief Feedback message to Foreman when a TableExportToStringWorkOrder has
+   *        completed writing a block to the string buffer.
+   */
+  enum FeedbackMessageType : WorkOrder::FeedbackMessageType {
+  kBlockOutputMessage,
+  };
+
+  /**
+   * @brief Constructor.
+   *
+   * @param query_id The ID of the query to which this operator belongs.
+   * @param input_relation The relation to export.
+   * @param input_relation_is_stored If input_relation is a stored relation and
+   *        is fully available to the operator before it can start generating
+   *        workorders.
+   * @param file_name The name of the file to export the relation to.
+   * @param options The options that specify the detailed format of the output
+   *        file.
+   */
+  TableExportOperator(const std::size_t query_id,
+                      const CatalogRelation &input_relation,
+                      const bool input_relation_is_stored,
+                      const std::string &file_name,
+                      const BulkIOConfigurationPtr &options)
+      : RelationalOperator(query_id),
+        input_relation_(input_relation),
+        input_relation_is_stored_(input_relation_is_stored),
+        file_name_(file_name),
+        options_(options),
+        input_relation_block_ids_(input_relation_is_stored
+                                      ? input_relation.getBlocksSnapshot()
+                                      : std::vector<block_id>()),
+        num_workorders_generated_(0),
+        started_(false),
+        num_blocks_written_(0),
+        file_(nullptr) {}
+
+  ~TableExportOperator() override {}
+
+  OperatorType getOperatorType() const override {
+    return kTableExport;
+  }
+
+  std::string getName() const override {
+    return "TableExportOperator";
+  }
+
+  /**
+   * @return The relation to export.
+   */
+  const CatalogRelation& input_relation() const {
+    return input_relation_;
+  }
+
+  bool getAllWorkOrders(WorkOrdersContainer *container,
+                        QueryContext *query_context,
+                        StorageManager *storage_manager,
+                        const tmb::client_id scheduler_client_id,
+                        tmb::MessageBus *bus) override;
+
+  bool getAllWorkOrderProtos(WorkOrderProtosContainer *con

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198676
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-    const ParseStatementCopyFrom &copy_from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+    const ParseStatementCopy &copy_from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == ParseStatementCopy::kFrom);
+  const PtrList<ParseKeyValue> *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-    if (params->delimiter != nullptr) {
-      column_delimiter_ = params->delimiter->value();
-      if (column_delimiter_.size() != 1) {
-        THROW_SQL_ERROR_AT(params->delimiter)
-            << "DELIMITER is not a single character";
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        // TODO(jianqiao): Support other bulk load formats such as CSV.
+        if (format != "text") {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        // Update file_format when other formats get supported.
+        break;
+      }
+    }
+  }
+
+  std::unique_ptr<BulkIOConfiguration> options =
+      std::make_unique<BulkIOConfiguration>(file_format);
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "delimiter") {
+        const ParseString *parse_delimiter = GetKeyValueString(param);
+        const std::string &delimiter = parse_delimiter->value();
+        if (delimiter.size() != 1) {
+          THROW_SQL_ERROR_AT(parse_delimiter)
+              << "DELIMITER is not a single character";
+        }
+        options->setDelimiter(delimiter.front());
+      } else if (key == "escape_strings") {
+        options->setEscapeStrings(GetKeyValueBool(param));
+      } else if (key != "format") {
+        THROW_SQL_ERROR_AT(&param) << "Unsupported copy option: " << key;
       }
     }
-    escape_strings_ = params->escape_strings;
   }
 
   return L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
-                             copy_from_statement.source_filename()->value(),
-                             column_delimiter_[0],
-                             escape_strings_);
+                             copy_from_statement.file_name()->value(),
+                             BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+    const ParseStatementCopy &copy_to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList<ParseKeyValue> *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        if (format == "csv") {
+          file_format = BulkIOFormat::kCSV;
+        } else if (format == "text") {
+          file_format = BulkIOFormat::kText;
+        } else {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        format_specified = true;
+        break;
+      }
+    }
+  }
+
+  const std::string &file_name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+    THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+        << "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+    if (file_name.length() > 4) {
+      if (ToLower(file_name.substr(file_name.length() - 4)) == ".csv") {
+

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199029
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-    const ParseStatementCopyFrom &copy_from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+    const ParseStatementCopy &copy_from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == ParseStatementCopy::kFrom);
+  const PtrList<ParseKeyValue> *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-    if (params->delimiter != nullptr) {
-      column_delimiter_ = params->delimiter->value();
-      if (column_delimiter_.size() != 1) {
-        THROW_SQL_ERROR_AT(params->delimiter)
-            << "DELIMITER is not a single character";
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        // TODO(jianqiao): Support other bulk load formats such as CSV.
+        if (format != "text") {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        // Update file_format when other formats get supported.
+        break;
+      }
+    }
+  }
+
+  std::unique_ptr<BulkIOConfiguration> options =
+      std::make_unique<BulkIOConfiguration>(file_format);
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "delimiter") {
+        const ParseString *parse_delimiter = GetKeyValueString(param);
+        const std::string &delimiter = parse_delimiter->value();
+        if (delimiter.size() != 1) {
+          THROW_SQL_ERROR_AT(parse_delimiter)
+              << "DELIMITER is not a single character";
+        }
+        options->setDelimiter(delimiter.front());
+      } else if (key == "escape_strings") {
+        options->setEscapeStrings(GetKeyValueBool(param));
+      } else if (key != "format") {
+        THROW_SQL_ERROR_AT(&param) << "Unsupported copy option: " << key;
       }
     }
-    escape_strings_ = params->escape_strings;
   }
 
   return L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
-                             copy_from_statement.source_filename()->value(),
-                             column_delimiter_[0],
-                             escape_strings_);
+                             copy_from_statement.file_name()->value(),
+                             BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+    const ParseStatementCopy &copy_to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList<ParseKeyValue> *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+    for (const ParseKeyValue &param : *params) {
+      const std::string &key = ToLower(param.key()->value());
+      if (key == "format") {
+        const ParseString *parse_format = GetKeyValueString(param);
+        const std::string format = ToLower(parse_format->value());
+        if (format == "csv") {
+          file_format = BulkIOFormat::kCSV;
+        } else if (format == "text") {
+          file_format = BulkIOFormat::kText;
+        } else {
+          THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " << format;
+        }
+        format_specified = true;
+        break;
+      }
+    }
+  }
+
+  const std::string &file_name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+    THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+        << "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+    if (file_name.length() > 4) {
+      if (ToLower(file_name.substr(file_name.length() - 4)) == ".csv") {
+

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136178749
  
--- Diff: parser/ParseStatement.hpp ---
@@ -60,16 +60,16 @@ class ParseStatement : public ParseTreeNode {
    * @brief The possible types of SQL statements.
    **/
   enum StatementType {
-    kCreateTable,
+    kCommand,
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136175863
  
--- Diff: parser/ParseKeyValue.hpp ---
@@ -37,14 +37,15 @@ namespace quickstep {
  */
 class ParseKeyValue : public ParseTreeNode {
  public:
-  enum class KeyValueType {
+  enum KeyValueType {
+    kStringBool,
--- End diff --

Updated.


---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/291

Add "COPY TO" operator for exporting data from Quickstep.

This PR adds support for the "COPY TO" statement for exporting tables from 
Quickstep. Two formats, `TEXT` and `CSV`, are supported.

The currently available copy options are:
- `FORMAT`: Format of the output file, either `TEXT` or `CSV`.
- `DELIMITER`: Separator character of the fields.
- `HEADER`: Whether to add a table header. For `CSV` format only.
- `QUOTE`: The quote character. For `CSV` format only.
- `ESCAPE_STRINGS`: Whether to escape special characters. For `TEXT` format 
only.
- `NULL_STRING`: The string representation of the `NULL` value.

See the example queries and results 
[here](https://github.com/apache/incubator-quickstep/blob/a036acb446f137fea263ae218ef12f337f5bc1a1/query_optimizer/tests/execution_generator/Copy.test).
 

Some convenience features are also provided:
- Export the result table from a query.
```
-- (1) --
COPY
  SELECT x FROM r
TO 'data.txt';

-- (2) --
WITH s(v) AS (
  SELECT MIN(y) FROM r GROUP BY x
)
COPY
  SELECT AVG(v) FROM s
TO 'results.csv';
```

- Print to standard output/error stream, e.g.
```
-- (1) --
COPY r TO stdout;

-- (2) --
COPY 
  SELECT x FROM r
TO stderr;
```
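
Neither query in the first example passes a `FORMAT` option, and the review diff quoted earlier in this thread infers the format from the file name extension. Below is a minimal sketch of that inference. It assumes that `TEXT` is the fallback when the extension is not `.csv`; the enum and function names are hypothetical and are not taken from the Quickstep sources.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical enum for illustration; not the type used in Quickstep.
enum class CopyFormatExample { kText, kCsv };

// Infers the output format from the file name extension.
// Assumption: anything that does not end in ".csv" falls back to TEXT.
CopyFormatExample InferCopyFormatExample(const std::string &file_name) {
  const std::string csv_suffix = ".csv";
  // Mirror the quoted diff: the name must be longer than the suffix itself.
  if (file_name.length() > csv_suffix.length()) {
    std::string tail =
        file_name.substr(file_name.length() - csv_suffix.length());
    // Lower-case the tail so that "RESULTS.CSV" is also recognized.
    std::transform(tail.begin(), tail.end(), tail.begin(),
                   [](unsigned char c) {
                     return static_cast<char>(std::tolower(c));
                   });
    if (tail == csv_suffix) {
      return CopyFormatExample::kCsv;
    }
  }
  return CopyFormatExample::kText;
}
```

Under these assumptions `'results.csv'` would be exported as `CSV` and `'data.txt'` as `TEXT`.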

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/incubator-quickstep copy-to

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/291.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #291


commit a036acb446f137fea263ae218ef12f337f5bc1a1
Author: Jianqiao Zhu <jianq...@cs.wisc.edu>
Date:   2017-08-04T21:49:45Z

Add "COPY TO" operator for exporting data from Quickstep.




---


[GitHub] incubator-quickstep issue #289: Minor refactored SortMergeRunOperator.

2017-08-30 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/289
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #290: Fixed the bug that missed assigning 'num_par...

2017-08-28 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/290
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #282: QUICKSTEP-92: Improved ExecutionDAGVisualize...

2017-08-28 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/282
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #280: Removed an unnecessary API in RelationalOper...

2017-08-24 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/280
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #279: Applied WorkOrderSelectionPolicy.

2017-08-18 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/279
  
I think the PR looks good, and the code structure is convenient for further 
adjustment. I am merging it now so that @zuyu can continue the work on 
partition-aware scheduling.


---


[GitHub] incubator-quickstep issue #276: Fixed the check failure if a query does not ...

2017-08-03 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/276
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #274: Determine #InitPartitions for CollisionFreeV...

2017-08-02 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/274
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #273: Determine #Partitions for Aggr State Hash Ta...

2017-08-02 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/273
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #265: Added Partition Rule For NestedLoopsJoin.

2017-07-14 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/265
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #264: Added physical rule for partitioned aggregat...

2017-07-14 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/264
  
LGTM. Merging.


---


[GitHub] incubator-quickstep issue #270: Collapse Selections with predicates.

2017-07-12 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/270
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #267: Refactored PartitionAwareInsertDestination::...

2017-07-10 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/267
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #257: Added Partition Rule for HashJoin.

2017-06-15 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/257
  
Good refactoring. LGTM, except for one place in `PhysicalGenerator.cpp`.


---


[GitHub] incubator-quickstep pull request #257: Added Partition Rule for HashJoin.

2017-06-15 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/257#discussion_r122326409
  
--- Diff: query_optimizer/PhysicalGenerator.cpp ---
@@ -176,6 +177,8 @@ P::PhysicalPtr PhysicalGenerator::optimizePlan() {
 rules.emplace_back(new AttachLIPFilters());
   }
 
+  rules.push_back(std::make_unique(optimizer_context_));
--- End diff --

The other question: does this rule always need to be applied (e.g. even in 
single-node mode)? If not, we could add a flag or `#ifdef` to apply it 
conditionally.
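
A rough sketch of the flag-based variant follows. All class and function names here are hypothetical stand-ins, not the actual optimizer types, and the boolean parameter stands in for whatever flag mechanism the project would use. An `#ifdef` variant would wrap the same `push_back` call in a preprocessor guard instead.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an optimizer rule interface.
struct RuleExample {
  virtual ~RuleExample() = default;
  virtual std::string name() const = 0;
};

// Hypothetical rule that is always applied.
struct AlwaysOnRuleExample : RuleExample {
  std::string name() const override { return "AlwaysOnRuleExample"; }
};

// Hypothetical rule that only matters when partitioning is in use.
struct PartitionRuleExample : RuleExample {
  std::string name() const override { return "PartitionRuleExample"; }
};

// Builds the rule list; the partition rule is appended only when requested.
std::vector<std::unique_ptr<RuleExample>> BuildRulesExample(
    bool apply_partition_rule) {
  std::vector<std::unique_ptr<RuleExample>> rules;
  rules.push_back(std::make_unique<AlwaysOnRuleExample>());
  if (apply_partition_rule) {
    rules.push_back(std::make_unique<PartitionRuleExample>());
  }
  return rules;
}
```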



---


[GitHub] incubator-quickstep issue #249: QUICKSTEP-76: Enabled LIP in the distributed...

2017-06-13 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/249
  
LGTM! Merging.


---


[GitHub] incubator-quickstep issue #252: Simplified ExtractCommonSubexpression rule a...

2017-06-01 Thread jianqiao
Github user jianqiao commented on the issue:

https://github.com/apache/incubator-quickstep/pull/252
  
LGTM. Can you rebase the commit onto the current master HEAD? Thanks!


---


[GitHub] incubator-quickstep pull request #246: Improve disjunctive predicate pushdow...

2017-05-04 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/246

Improve disjunctive predicate pushdown to optimize more scenarios

This PR improves the `PushDownLowCostDisjunctivePredicate` optimization to 
cover more scenarios: specifically, it pushes down the partial predicate 
when that predicate has low selectivity. This results in a 4X speedup for 
TPC-H Q19.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/incubator-quickstep improve-pushdown

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #246


commit ed72e2477033438e922195aa2632003ab037ab7b
Author: Jianqiao Zhu <jianq...@cs.wisc.edu>
Date:   2017-05-03T04:55:52Z

Improve disjunctive predicate pushdown to optimize more scenarios.




---


[GitHub] incubator-quickstep pull request #244: Fix a problem in CopyGroupList + mino...

2017-05-03 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/244

Fix a problem in CopyGroupList + minor style fixes

This PR adjusts the implementation of `CopyGroupList::merge_contiguous()` 
to handle some cases correctly.

E.g. let `R` be a split-row-store relation with integer attributes 
`x`,`y`,`z`. It is expected that for query
```
SELECT x, y, z
FROM R;
```
`merge_contiguous()` merges all attributes into one `ContiguousAttrs` copy 
group, with `bytes_to_copy` set to 12 (size of three integers).

It is also expected that for query
```
SELECT x, y, y
FROM R;
```
`merge_contiguous()` merges all attributes into two `ContiguousAttrs` copy 
groups, one for `x, y` and one for the last `y`.
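
The two expected outcomes above can be reproduced with a simplified sketch of the merging idea. This is not the actual `CopyGroupList` code: the struct and function names are hypothetical, attributes are described only by their byte offset and size in the source tuple, and the destination tuple is assumed to be packed in selection order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified copy group: one memcpy-able run of bytes.
struct CopyGroupExample {
  std::size_t src_offset;     // Start of the run in the source tuple.
  std::size_t dst_offset;     // Start of the run in the destination tuple.
  std::size_t bytes_to_copy;  // Length of the run in bytes.
};

// Hypothetical description of one selected attribute.
struct SelectedAttrExample {
  std::size_t src_offset;  // Byte offset of the attribute in the source tuple.
  std::size_t size;        // Attribute size in bytes.
};

// Merges selected attributes into runs that are contiguous in the source.
// The destination is packed, so it is always contiguous by construction.
std::vector<CopyGroupExample> MergeContiguousExample(
    const std::vector<SelectedAttrExample> &attrs) {
  std::vector<CopyGroupExample> groups;
  std::size_t dst_offset = 0;
  for (const SelectedAttrExample &attr : attrs) {
    if (!groups.empty() &&
        groups.back().src_offset + groups.back().bytes_to_copy ==
            attr.src_offset) {
      // The attribute starts exactly where the previous run ends: extend it.
      groups.back().bytes_to_copy += attr.size;
    } else {
      // Otherwise start a new run at the current destination offset.
      groups.push_back({attr.src_offset, dst_offset, attr.size});
    }
    dst_offset += attr.size;
  }
  return groups;
}
```

With 4-byte integers at source offsets 0, 4 and 8, selecting `x, y, z` gives one group of 12 bytes. Selecting `x, y, y` gives two groups, because the second `y` starts at source offset 4 and not at offset 8, where the `x, y` run ends.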


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/incubator-quickstep fix-copy-group

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #244


commit 72c7703102f6c367fe346b6bdcde03f1f0036d7b
Author: Jianqiao Zhu <jianq...@cs.wisc.edu>
Date:   2017-05-03T04:55:52Z

Fix a problem in CopyGroupList + minor style fixes.




---


[GitHub] incubator-quickstep pull request #237: QUICKSTEP-89 Add support for common s...

2017-04-24 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/237#discussion_r113070938
  
--- Diff: query_optimizer/expressions/Scalar.hpp ---
@@ -65,10 +67,49 @@ class Scalar : public Expression {
   const std::unordered_map<ExprId, const CatalogAttribute*>& 
substitution_map)
   const = 0;
 
+  /**
+   * @brief Check whether this scalar is semantically equivalent to \p 
other.
+   *
+   * @note The fact that two scalars are semantically equal brings more
+   *   optimization opportunities, e.g. common subexpression 
elimination.
+   *   Meanwhile, it is always safe to assume that two scalars are not 
equal.
+   *
+   * @return True if this scalar equals \p other; false otherwise.
+   */
+  virtual bool equals(const ScalarPtr &other) const {
+return false;
+  }
+
+  /**
+   * @brief Get a hash of this scalar.
+   *
+   * @return A hash of this scalar.
+   */
+  std::size_t hash() const {
+if (hash_cache_ == nullptr) {
+  hash_cache_ = std::make_unique(computeHash());
+}
+return *hash_cache_;
--- End diff --

Also `std::hash` returns `std::size_t`, so it may be more consistent to use 
`std::size_t` as hash value type here.


---


[GitHub] incubator-quickstep pull request #237: QUICKSTEP-89 Add support for common s...

2017-04-24 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/237#discussion_r113070668
  
--- Diff: query_optimizer/expressions/Scalar.hpp ---
@@ -65,10 +67,49 @@ class Scalar : public Expression {
   const std::unordered_map<ExprId, const CatalogAttribute*>& 
substitution_map)
   const = 0;
 
+  /**
+   * @brief Check whether this scalar is semantically equivalent to \p 
other.
+   *
+   * @note The fact that two scalars are semantically equal brings more
+   *   optimization opportunities, e.g. common subexpression 
elimination.
+   *   Meanwhile, it is always safe to assume that two scalars are not 
equal.
+   *
+   * @return True if this scalar equals \p other; false otherwise.
+   */
+  virtual bool equals(const ScalarPtr &other) const {
+return false;
+  }
+
+  /**
+   * @brief Get a hash of this scalar.
+   *
+   * @return A hash of this scalar.
+   */
+  std::size_t hash() const {
+if (hash_cache_ == nullptr) {
+  hash_cache_ = std::make_unique(computeHash());
+}
+return *hash_cache_;
--- End diff --

Performance is not a concern here, and the pointer makes the "null" (not 
yet computed) case explicit, since `0` itself can be a valid hash value.
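
To illustrate the point, here is a small stand-alone sketch of a lazily computed hash cached behind a pointer. The class is hypothetical and only mimics the pattern in the diff; it hashes a string member with `std::hash` and counts how often the hash is actually computed.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration; not the Quickstep Scalar class.
class CachedHashExample {
 public:
  explicit CachedHashExample(std::string text) : text_(std::move(text)) {}

  // Computes the hash on first use and returns the cached value afterwards.
  // A null pointer means "not computed yet", so a hash value of 0 stays valid.
  std::size_t hash() const {
    if (hash_cache_ == nullptr) {
      ++compute_count_;
      hash_cache_ = std::make_unique<std::size_t>(computeHash());
    }
    return *hash_cache_;
  }

  // Number of times the hash was actually computed.
  int compute_count() const { return compute_count_; }

 private:
  std::size_t computeHash() const { return std::hash<std::string>{}(text_); }

  std::string text_;
  mutable std::unique_ptr<std::size_t> hash_cache_;
  mutable int compute_count_ = 0;
};
```

A sentinel such as `0` could not distinguish "not computed" from a legitimately computed hash of `0`, which is why the pointer (or a comparable optional wrapper) is used.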


---

