[GitHub] [incubator-doris] kangkaisen commented on pull request #3703: [Load] Add more metric to trace the time cost in stream load and make brpc_num_threads configurable

2020-05-29 Thread GitBox


kangkaisen commented on pull request #3703:
URL: https://github.com/apache/incubator-doris/pull/3703#issuecomment-636276709


   @caiconghui Hi, if you only change webserver_num_workers from 5 to 48, by how many times does stream load performance improve?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] kangkaisen closed issue #3530: Group by query will core in Debug mode BE

2020-05-29 Thread GitBox


kangkaisen closed issue #3530:
URL: https://github.com/apache/incubator-doris/issues/3530


   






[GitHub] [incubator-doris] kangkaisen commented on a change in pull request #3722: [Bug] Fix bug that runningprofile show time problem in FE web page and add the runingprofile doc

2020-05-29 Thread GitBox


kangkaisen commented on a change in pull request #3722:
URL: https://github.com/apache/incubator-doris/pull/3722#discussion_r432806386



##
File path: docs/zh-CN/administrator-guide/running-profile.md
##
@@ -0,0 +1,151 @@
+---
+{
+"title": "Statistics of query execution",
+"language": "zh-CN"
+}
+---
+
+
+
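+# Statistics of query execution
+
+This document describes the statistics that Doris collects during query execution. These statistics help us understand how Doris executes a query and guide targeted **debugging and tuning**.
+
+
+## Glossary
+
+* FE: Frontend, the frontend node of Doris. Responsible for metadata management and request access.
+* BE: Backend, the backend node of Doris. Responsible for query execution and data storage.
+* Fragment: FE converts the execution of a SQL statement into Fragments and sends them to BE for execution. BE executes each Fragment and returns the aggregated results to FE.
+
+## Basic principles
+
+FE splits the query plan into Fragments and sends them to BE for execution. While executing a Fragment, BE records **runtime statistics** and writes the Fragment's execution statistics to the log. FE can also collect these statistics from each Fragment through a switch and print the results on the FE web page.
+
+## Operation process
+
+Use a MySQL command to turn on the report switch on FE: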
+
+```
+mysql> set is_report_success=true; 
+```
+
+After that, execute the SQL statement; the report for that statement then appears on the FE web page:
+![image.png](https://upload-images.jianshu.io/upload_images/8552201-f5308be377dc4d90.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
+
+The page lists the **100 most recently completed statements**, and the Profile link shows the detailed statistics.
+```
+Query:
+  Summary:
+Query ID: 9664061c57e84404-85ae111b8ba7e83a
+Start Time: 2020-05-02 10:34:57
+End Time: 2020-05-02 10:35:08
+Total: 10s323ms
+Query Type: Query
+Query State: EOF
+Doris Version: trunk
+User: root
+Default Db: default_cluster:test
+Sql Statement: select max(Bid_Price) from quotes group by Symbol
+```
+This summary lists the **query ID, execution time, executed statement**, and so on. What follows is the detailed information for each Fragment collected from BE.
+ ```
+Fragment 0:
+  Instance 9664061c57e84404-85ae111b8ba7e83d (host=TNetworkAddress(hostname:10.144.192.47, port:9060)):(Active: 10s270ms, % non-child: 0.14%)
+ - MemoryLimit: 2.00 GB
+ - BytesReceived: 168.08 KB
+ - PeakUsedReservation: 0.00 
+ - SendersBlockedTimer: 0ns
+ - DeserializeRowBatchTimer: 501.975us
+ - PeakMemoryUsage: 577.04 KB
+ - RowsProduced: 8.322K (8322)
+EXCHANGE_NODE (id=4):(Active: 10s256ms, % non-child: 99.35%)
+   - ConvertRowBatchTime: 180.171us
+   - PeakMemoryUsage: 0.00 
+   - RowsReturned: 8.322K (8322)
+   - MemoryUsed: 0.00 
+   - RowsReturnedRate: 811
+```
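+This lists the Fragment ID; ```hostname``` is the BE node that executes the Fragment; ```Active:10s270ms``` is the total execution time of the node; ```non-child: 0.14%``` means excluding the execution time of the executing node itself, not including the execution time of its child nodes. The statistics of the child nodes are printed afterwards; **the parent-child relationship between nodes can be told from the indentation**.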

Review comment:
   ```suggestion
   
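This lists the Fragment ID; ```hostname``` is the BE node that executes the Fragment; ```Active:10s270ms``` is the total execution time of the node; ```non-child: 0.14%``` means the execution time of the node itself, not including the execution time of its child nodes. The statistics of the child nodes are printed afterwards; **the parent-child relationship between nodes can be told from the indentation**.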
   ```
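
As a rough illustration of the "own time, excluding children" reading in this suggestion, the sketch below parses the `Active` durations from the sample profile and subtracts the child's time from the parent's. The class `ProfileTimes` and its methods are hypothetical helpers, not Doris code, and treating non-child time as parent `Active` minus children `Active` is an assumption; the doc does not say which total the percentage is divided by.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Doris.
class ProfileTimes {
    // Matches pieces such as "10s", "270ms" or "501.975us".
    private static final Pattern PART =
            Pattern.compile("(\\d+(?:\\.\\d+)?)(ns|us|ms|h|m|s)");

    // Converts a pretty-printed duration such as "10s270ms" to nanoseconds.
    static long toNanos(String text) {
        Matcher m = PART.matcher(text);
        long total = 0;
        while (m.find()) {
            double value = Double.parseDouble(m.group(1));
            long unit;
            switch (m.group(2)) {
                case "ns": unit = 1L; break;
                case "us": unit = 1_000L; break;
                case "ms": unit = 1_000_000L; break;
                case "s":  unit = 1_000_000_000L; break;
                case "m":  unit = 60_000_000_000L; break;
                default:   unit = 3_600_000_000_000L; break; // "h"
            }
            total += Math.round(value * unit);
        }
        return total;
    }

    // Assumption: a node's own time is its Active time minus its children's.
    static long selfNanos(String parentActive, String... childActives) {
        long self = toNanos(parentActive);
        for (String child : childActives) {
            self -= toNanos(child);
        }
        return self;
    }

    public static void main(String[] args) {
        // Instance (10s270ms) with its child EXCHANGE_NODE (10s256ms).
        long selfMs = selfNanos("10s270ms", "10s256ms") / 1_000_000;
        System.out.println("own time: " + selfMs + " ms");
    }
}
```

With the sample values, the instance's own time comes out far smaller than its `Active` time, which is consistent with the small non-child percentage shown for it.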








[GitHub] [incubator-doris] morningman commented on a change in pull request #3724: Fix large string val allocation failure

2020-05-29 Thread GitBox


morningman commented on a change in pull request #3724:
URL: https://github.com/apache/incubator-doris/pull/3724#discussion_r432806370



##
File path: be/src/exprs/bitmap_function.cpp
##
@@ -488,7 +488,8 @@ StringVal BitmapFunctions::bitmap_from_string(FunctionContext* ctx, const String
 }
 
 std::vector bits;
-if (!SplitStringAndParse({(const char*)input.ptr, input.len}, ",", 
_strtou64, )) {
+// TODO(hkp): I think StringPiece's len should also be uint64_t

Review comment:
   Could you explain why this is not changed to int64? You can just add a comment here.








[GitHub] [incubator-doris] morningman commented on pull request #3723: Fix UT ThreadPoolManagerTest failure

2020-05-29 Thread GitBox


morningman commented on pull request #3723:
URL: https://github.com/apache/incubator-doris/pull/3723#issuecomment-636268081


   > 700 is larger than the task sleep time of 500.
   
   I see, thank you~






[GitHub] [incubator-doris] morningman commented on pull request #3731: [Bug][FsBroker] NPE throw when username is empty

2020-05-29 Thread GitBox


morningman commented on pull request #3731:
URL: https://github.com/apache/incubator-doris/pull/3731#issuecomment-636267883


   > Is an empty user_name valid?
   
   Not valid, but it is OK.
   This is just self-protection.
   
   






[GitHub] [incubator-doris] kangkaisen commented on pull request #3731: [Bug][FsBroker] NPE throw when username is empty

2020-05-29 Thread GitBox


kangkaisen commented on pull request #3731:
URL: https://github.com/apache/incubator-doris/pull/3731#issuecomment-636266561


   Is an empty user_name valid?






[GitHub] [incubator-doris] kangkaisen commented on pull request #3723: Fix UT ThreadPoolManagerTest failure

2020-05-29 Thread GitBox


kangkaisen commented on pull request #3723:
URL: https://github.com/apache/incubator-doris/pull/3723#issuecomment-636265936


   > > ```
   > > java.lang.AssertionError: expected:<0> but was:<1>
   > >  at org.apache.doris.common.ThreadPoolManagerTest.testNormal(ThreadPoolManagerTest.java:64)
   > > -
   > > ```
   > 
   > Could you explain why 700 is ok?
   
   700 is larger than the task sleep time of 500.
   
   @chaoyli  Hi, if there are open questions about this PR, you shouldn't merge it.
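
The timing argument can be sketched as follows, assuming a pool whose tasks each sleep for a fixed time while the test thread waits somewhat longer before checking counters. `SleepMarginDemo` and `completedAfter` are hypothetical names for illustration; this is not the code of `ThreadPoolManagerTest`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical demo, not the actual Doris test.
class SleepMarginDemo {
    // Starts two tasks that each sleep taskSleepMs, waits waitMs on the
    // calling thread, then reports how many tasks the pool has completed.
    static long completedAfter(long taskSleepMs, long waitMs) {
        ThreadPoolExecutor pool =
                (ThreadPoolExecutor) Executors.newCachedThreadPool();
        try {
            for (int i = 0; i < 2; i++) {
                pool.execute(() -> {
                    try {
                        Thread.sleep(taskSleepMs);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            try {
                Thread.sleep(waitMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return pool.getCompletedTaskCount();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Waiting 700 ms leaves a margin over the 500 ms task sleep.
        System.out.println(completedAfter(500, 700));
    }
}
```

When the wait equals the task sleep time, the check races with task completion, so the counters may or may not have been updated yet; a wait that is clearly longer removes that race in practice.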






[GitHub] [incubator-doris] chaoyli merged pull request #3690: [compaction] Update cumulative point calculate algorithm

2020-05-29 Thread GitBox


chaoyli merged pull request #3690:
URL: https://github.com/apache/incubator-doris/pull/3690


   






[incubator-doris] branch master updated (7524c5e -> 43d25af)

2020-05-29 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from 7524c5e  [Memory Engine] Add MemSubTablet, MemTablet, WriteTx, PartialRowBatch (#3637)
 add 43d25af  [compaction] Update cumulative point calculate algorithm (#3690)

No new revisions were added by this update.

Summary of changes:
 be/src/olap/tablet.cpp | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)





[incubator-doris] branch master updated: Fix UT ThreadPoolManagerTest failure (#3723)

2020-05-29 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 5cb4063  Fix UT ThreadPoolManagerTest failure (#3723)
5cb4063 is described below

commit 5cb4063904960ea922a2bd1a04046db3e90220c5
Author: Binglin Chang 
AuthorDate: Sat May 30 10:35:07 2020 +0800

Fix UT ThreadPoolManagerTest failure (#3723)
---
 fe/src/test/java/org/apache/doris/common/ThreadPoolManagerTest.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fe/src/test/java/org/apache/doris/common/ThreadPoolManagerTest.java b/fe/src/test/java/org/apache/doris/common/ThreadPoolManagerTest.java
index ed26f26..2ae2ea9 100755
--- a/fe/src/test/java/org/apache/doris/common/ThreadPoolManagerTest.java
+++ b/fe/src/test/java/org/apache/doris/common/ThreadPoolManagerTest.java
@@ -58,7 +58,7 @@ public class ThreadPoolManagerTest {
 Assert.assertEquals(0, testCachedPool.getQueue().size());
 Assert.assertEquals(0, testCachedPool.getCompletedTaskCount());
 
-Thread.sleep(500);
+Thread.sleep(700);
 
 Assert.assertEquals(2, testCachedPool.getPoolSize());
 Assert.assertEquals(0, testCachedPool.getActiveCount());





[GitHub] [incubator-doris] chaoyli merged pull request #3723: Fix UT ThreadPoolManagerTest failure

2020-05-29 Thread GitBox


chaoyli merged pull request #3723:
URL: https://github.com/apache/incubator-doris/pull/3723


   






[GitHub] [incubator-doris] chaoyli merged pull request #3637: [Memory Engine] Add MemSubTablet, MemTablet, WriteTx, PartialRowBatch

2020-05-29 Thread GitBox


chaoyli merged pull request #3637:
URL: https://github.com/apache/incubator-doris/pull/3637


   






[incubator-doris] branch master updated: [Memory Engine] Add MemSubTablet, MemTablet, WriteTx, PartialRowBatch (#3637)

2020-05-29 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 7524c5e  [Memory Engine] Add MemSubTablet, MemTablet, WriteTx, PartialRowBatch (#3637)
7524c5e is described below

commit 7524c5ef63becb184583dae4111a19bcb0b43e22
Author: Binglin Chang 
AuthorDate: Sat May 30 10:33:10 2020 +0800

[Memory Engine] Add MemSubTablet, MemTablet, WriteTx, PartialRowBatch (#3637)
---
 be/src/olap/base_tablet.cpp|  13 +-
 be/src/olap/base_tablet.h  |  14 +-
 be/src/olap/memory/CMakeLists.txt  |   3 +
 be/src/olap/memory/common.h|   1 +
 be/src/olap/memory/mem_sub_tablet.cpp  | 246 ++
 be/src/olap/memory/mem_sub_tablet.h| 120 +
 be/src/olap/memory/mem_tablet.cpp  |  35 ++-
 be/src/olap/memory/mem_tablet.h|  47 +++-
 be/src/olap/memory/partial_row_batch.cpp   | 274 +
 be/src/olap/memory/partial_row_batch.h | 172 +
 be/src/olap/memory/schema.cpp  |  72 ++
 be/src/olap/memory/schema.h|  38 ++-
 .../olap/memory/{mem_tablet.cpp => write_txn.cpp}  |  16 +-
 be/src/olap/memory/{mem_tablet.h => write_txn.h}   |  35 ++-
 be/src/olap/tablet.cpp |   9 -
 be/src/olap/tablet.h   |   3 +-
 be/src/util/time.h |   4 +
 be/test/olap/CMakeLists.txt|   1 +
 be/test/olap/memory/partial_row_batch_test.cpp | 111 +
 be/test/olap/memory/schema_test.cpp|  22 +-
 20 files changed, 1199 insertions(+), 37 deletions(-)

diff --git a/be/src/olap/base_tablet.cpp b/be/src/olap/base_tablet.cpp
index 41aa93a..368f5ed 100644
--- a/be/src/olap/base_tablet.cpp
+++ b/be/src/olap/base_tablet.cpp
@@ -15,7 +15,9 @@
 // specific language governing permissions and limitations
 // under the License.
 
-#include "base_tablet.h"
+#include "olap/base_tablet.h"
+#include "util/path_util.h"
+#include "olap/data_dir.h"
 
 namespace doris {
 
@@ -24,6 +26,7 @@ BaseTablet::BaseTablet(TabletMetaSharedPtr tablet_meta, DataDir* data_dir) :
 _tablet_meta(tablet_meta),
 _schema(tablet_meta->tablet_schema()),
 _data_dir(data_dir) {
+_gen_tablet_path();
 }
 
 BaseTablet::~BaseTablet() {
@@ -40,4 +43,12 @@ OLAPStatus BaseTablet::set_tablet_state(TabletState state) {
 return OLAP_SUCCESS;
 }
 
+void BaseTablet::_gen_tablet_path() {
+std::string path = _data_dir->path() + DATA_PREFIX;
+path = path_util::join_path_segments(path, std::to_string(_tablet_meta->shard_id()));
+path = path_util::join_path_segments(path, std::to_string(_tablet_meta->tablet_id()));
+path = path_util::join_path_segments(path, std::to_string(_tablet_meta->schema_hash()));
+_tablet_path = path;
+}
+
 } /* namespace doris */
diff --git a/be/src/olap/base_tablet.h b/be/src/olap/base_tablet.h
index 34020eb..f3b0c2d 100644
--- a/be/src/olap/base_tablet.h
+++ b/be/src/olap/base_tablet.h
@@ -19,12 +19,15 @@
 #define DORIS_BE_SRC_OLAP_BASE_TABLET_H
 
 #include 
+
 #include "olap/olap_define.h"
 #include "olap/tablet_meta.h"
 #include "olap/utils.h"
 
 namespace doris {
 
+class DataDir;
+
 // Base class for all tablet classes, currently only olap/Tablet and
 // olap/memory/MemTablet.
 // The fields and methods in this class is not final, it will change as memory
@@ -57,10 +60,13 @@ public:
 inline void set_creation_time(int64_t creation_time);
 inline bool equal(int64_t tablet_id, int32_t schema_hash);
 
-// propreties encapsulated in TabletSchema
+// properties encapsulated in TabletSchema
 inline const TabletSchema& tablet_schema() const;
 
 protected:
+void _gen_tablet_path();
+
+protected:
 TabletState _state;
 TabletMetaSharedPtr _tablet_meta;
 TabletSchema _schema;
@@ -72,7 +78,6 @@ private:
 DISALLOW_COPY_AND_ASSIGN(BaseTablet);
 };
 
-
 inline DataDir* BaseTablet::data_dir() const {
 return _data_dir;
 }
@@ -99,9 +104,8 @@ inline int64_t BaseTablet::table_id() const {
 
 inline const std::string BaseTablet::full_name() const {
 std::stringstream ss;
-ss << _tablet_meta->tablet_id()
-   << "." << _tablet_meta->schema_hash()
-   << "." << _tablet_meta->tablet_uid().to_string();
+ss << _tablet_meta->tablet_id() << "." << _tablet_meta->schema_hash() << "."
+   << _tablet_meta->tablet_uid().to_string();
 return ss.str();
 }
 
diff --git a/be/src/olap/memory/CMakeLists.txt b/be/src/olap/memory/CMakeLists.txt
index 9de9095..b552dfe 100644
--- a/be/src/olap/memory/CMakeLists.txt
+++ b/be/src/olap/memory/CMakeLists.txt
@@ -29,5 +29,8 @@ add_library(Memory STATIC
 

[GitHub] [incubator-doris] morningman commented on a change in pull request #3716: [Spark load] Fe submit spark etl job

2020-05-29 Thread GitBox


morningman commented on a change in pull request #3716:
URL: https://github.com/apache/incubator-doris/pull/3716#discussion_r432800096



##
File path: fe/src/main/java/org/apache/doris/catalog/OlapTable.java
##
@@ -530,6 +530,30 @@ public KeysType getKeysType() {
 return keysType;
 }
 
+public KeysType getKeysTypeByIndexId(long indexId) {

Review comment:
   There is a `keysType` in `MaterializedIndexMeta`. You can get it directly.

##
File path: fe/src/main/java/org/apache/doris/common/util/BrokerUtil.java
##
@@ -17,52 +17,59 @@
 
 package org.apache.doris.common.util;
 
-import com.google.common.collect.Lists;
 import org.apache.doris.analysis.BrokerDesc;
 import org.apache.doris.catalog.Catalog;
 import org.apache.doris.catalog.FsBroker;
 import org.apache.doris.common.AnalysisException;
 import org.apache.doris.common.ClientPool;
+import org.apache.doris.common.Config;
 import org.apache.doris.common.UserException;
 import org.apache.doris.service.FrontendOptions;
+import org.apache.doris.thrift.TBrokerCloseReaderRequest;
+import org.apache.doris.thrift.TBrokerCloseWriterRequest;
+import org.apache.doris.thrift.TBrokerDeletePathRequest;
+import org.apache.doris.thrift.TBrokerFD;
 import org.apache.doris.thrift.TBrokerFileStatus;
 import org.apache.doris.thrift.TBrokerListPathRequest;
 import org.apache.doris.thrift.TBrokerListResponse;
+import org.apache.doris.thrift.TBrokerOpenMode;
+import org.apache.doris.thrift.TBrokerOpenReaderRequest;
+import org.apache.doris.thrift.TBrokerOpenReaderResponse;
+import org.apache.doris.thrift.TBrokerOpenWriterRequest;
+import org.apache.doris.thrift.TBrokerOpenWriterResponse;
+import org.apache.doris.thrift.TBrokerOperationStatus;
 import org.apache.doris.thrift.TBrokerOperationStatusCode;
+import org.apache.doris.thrift.TBrokerPReadRequest;
+import org.apache.doris.thrift.TBrokerPWriteRequest;
+import org.apache.doris.thrift.TBrokerReadResponse;
 import org.apache.doris.thrift.TBrokerVersion;
 import org.apache.doris.thrift.TNetworkAddress;
 import org.apache.doris.thrift.TPaloBrokerService;
 
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+
 import org.apache.logging.log4j.LogManager;
 import org.apache.logging.log4j.Logger;
 import org.apache.thrift.TException;
 
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.UnsupportedEncodingException;
+import java.nio.ByteBuffer;
+import java.nio.channels.FileChannel;
 import java.util.Collections;
 import java.util.List;
 
 public class BrokerUtil {
 private static final Logger LOG = LogManager.getLogger(BrokerUtil.class);
 
+private static int READ_BUFFER_SIZE = 1024 * 1024;

Review comment:
   Add the unit to the name.

##
File path: fe/src/main/java/org/apache/doris/common/Pair.java
##
@@ -25,7 +27,9 @@
 public class Pair {
 public static PairComparator> PAIR_VALUE_COMPARATOR = new PairComparator<>();
 
+@SerializedName(value = "first")

Review comment:
   I'm not sure this is OK, because there is no guarantee that the `F` and `S` objects can also be serialized by GSON.

##
File path: fe/src/main/java/org/apache/doris/load/loadv2/SparkEtlJobHandler.java
##
@@ -0,0 +1,170 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.load.loadv2;
+
+import org.apache.doris.PaloFe;
+import org.apache.doris.analysis.BrokerDesc;
+import org.apache.doris.catalog.SparkResource;
+import org.apache.doris.common.LoadException;
+import org.apache.doris.common.UserException;
+import org.apache.doris.common.util.BrokerUtil;
+import org.apache.doris.load.loadv2.etl.EtlJobConfig;
+import org.apache.doris.thrift.TEtlState;
+
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+import org.apache.spark.launcher.SparkAppHandle;
+import org.apache.spark.launcher.SparkAppHandle.Listener;
+import org.apache.spark.launcher.SparkAppHandle.State;
+import org.apache.spark.launcher.SparkLauncher;
+
+import java.io.IOException;
+import java.io.UnsupportedEncodingException;
+import java.util.Map;
+
+/**
+ * SparkEtlJobHandler is responsible for
+ * 1. submit spark etl job
+ * 

[GitHub] [incubator-doris] morningman commented on a change in pull request #3715: [Spark load] Fe create job

2020-05-29 Thread GitBox


morningman commented on a change in pull request #3715:
URL: https://github.com/apache/incubator-doris/pull/3715#discussion_r432798579



##
File path: fe/src/main/java/org/apache/doris/common/Config.java
##
@@ -491,6 +491,12 @@
 @ConfField(mutable = true, masterOnly = true)
 public static int hadoop_load_default_timeout_second = 86400 * 3; // 3 day
 
+/*
+ * Default spark load timeout
+ */
+@ConfField(mutable = true, masterOnly = true)
+public static int spark_load_default_timeout_second = 86400 * 3; // 3 days

Review comment:
   I think 1 day is long enough~

##
File path: fe/src/main/java/org/apache/doris/load/loadv2/SparkLoadJob.java
##
@@ -0,0 +1,249 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.load.loadv2;
+
+import com.google.common.base.Strings;
+import org.apache.doris.analysis.BrokerDesc;
+import org.apache.doris.analysis.ResourceDesc;
+import org.apache.doris.catalog.Catalog;
+import org.apache.doris.catalog.Resource;
+import org.apache.doris.catalog.SparkResource;
+import org.apache.doris.common.Config;
+import org.apache.doris.common.DdlException;
+import org.apache.doris.common.MetaNotFoundException;
+import org.apache.doris.common.Pair;
+import org.apache.doris.common.io.Text;
+import org.apache.doris.load.EtlJobType;
+import org.apache.doris.load.FailMsg;
+import org.apache.doris.qe.OriginStatement;
+import org.apache.doris.task.AgentTaskQueue;
+import org.apache.doris.task.PushTask;
+
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+import org.apache.spark.launcher.SparkAppHandle;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * There are 4 steps in SparkLoadJob:
+ * Step1: SparkLoadPendingTask will be created by unprotectedExecuteJob method and submit spark etl job.
+ * Step2: LoadEtlChecker will check spark etl job status periodly and send push tasks to be when spark etl job is finished.
+ * Step3: LoadLoadingChecker will check loading status periodly and commit transaction when push tasks are finished.
+ * Step4: PublishVersionDaemon will send publish version tasks to be and finish transaction.
+ */
+public class SparkLoadJob extends BulkLoadJob {
+private static final Logger LOG = LogManager.getLogger(SparkLoadJob.class);
+
+// for global dict
+public static final String BITMAP_DATA_PROPERTY = "bitmap_data";

Review comment:
   This property is hard to understand and is coupled with the implementation details of the global dict.
   How about changing it to a more abstract noun?

##
File path: fe/src/main/java/org/apache/doris/load/loadv2/BulkLoadJob.java
##
@@ -0,0 +1,328 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.load.loadv2;
+
+import org.apache.doris.analysis.BrokerDesc;
+import org.apache.doris.analysis.DataDescription;
+import org.apache.doris.analysis.LoadStmt;
+import org.apache.doris.analysis.SqlParser;
+import org.apache.doris.analysis.SqlScanner;
+import org.apache.doris.catalog.AuthorizationInfo;
+import org.apache.doris.catalog.Catalog;
+import org.apache.doris.catalog.Database;
+import org.apache.doris.catalog.Table;
+import 

[GitHub] [incubator-doris] wyb commented on a change in pull request #3715: [Spark load] Fe create job

2020-05-29 Thread GitBox


wyb commented on a change in pull request #3715:
URL: https://github.com/apache/incubator-doris/pull/3715#discussion_r432797917



##
File path: fe/src/main/java/org/apache/doris/analysis/ResourceDesc.java
##
@@ -0,0 +1,121 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.analysis;
+
+import com.google.common.collect.Maps;
+import org.apache.doris.catalog.Catalog;
+import org.apache.doris.catalog.Resource;
+import org.apache.doris.common.AnalysisException;
+import org.apache.doris.common.io.Text;
+import org.apache.doris.common.io.Writable;
+import org.apache.doris.common.util.PrintableMap;
+import org.apache.doris.load.EtlJobType;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.util.Map;
+
+// Resource descriptor
+//
+// Spark example:
+// WITH RESOURCE "spark0"
+// (
+//   "spark.jars" = "xxx.jar,yyy.jar",
+//   "spark.files" = "/tmp/aaa,/tmp/bbb",
+//   "spark.executor.memory" = "1g",
+//   "spark.yarn.queue" = "queue0"
+// )
+public class ResourceDesc implements Writable {
+protected String name;
+protected Map properties;
+protected EtlJobType etlJobType;
+
+// Only used for recovery
+private ResourceDesc() {
+}
+
+public ResourceDesc(String name, Map properties) {
+this.name = name;
+this.properties = properties;
+if (this.properties == null) {
+this.properties = Maps.newHashMap();
+}
+this.etlJobType = EtlJobType.UNKNOWN;
+}
+
+public String getName() {
+return name;
+}
+
+public Map getProperties() {
+return properties;
+}
+
+public EtlJobType getEtlJobType() {
+return etlJobType;
+}
+
+public void analyze() throws AnalysisException {
+// check resource exist or not
+Resource resource = Catalog.getCurrentCatalog().getResourceMgr().getResource(getName());
+if (resource == null) {
+throw new AnalysisException("Resource does not exist. name: " + getName());
+}
+if (resource.getType() == Resource.ResourceType.SPARK) {
+etlJobType = EtlJobType.SPARK;
+}
+}
+
+@Override
+public void write(DataOutput out) throws IOException {

Review comment:
   I removed this serialization because it is not used.








[GitHub] [incubator-doris] wyb commented on a change in pull request #3715: [Spark load] Fe create job

2020-05-29 Thread GitBox


wyb commented on a change in pull request #3715:
URL: https://github.com/apache/incubator-doris/pull/3715#discussion_r432797382



##
File path: fe/src/main/java/org/apache/doris/analysis/BrokerDesc.java
##
@@ -17,61 +17,36 @@
 
 package org.apache.doris.analysis;
 
-import org.apache.doris.common.io.Text;
-import org.apache.doris.common.io.Writable;
+import org.apache.doris.common.AnalysisException;
+import org.apache.doris.load.EtlJobType;
 import org.apache.doris.common.util.PrintableMap;
 
-import com.google.common.collect.Maps;
-
 import java.io.DataInput;
-import java.io.DataOutput;
 import java.io.IOException;
 import java.util.Map;
 
 // Broker descriptor
-public class BrokerDesc implements Writable {
-private String name;
-private Map properties;
-
+//
+// Broker example:
+// WITH BROKER "broker0"
+// (
+//   "username" = "user0",
+//   "password" = "password0"
+// )
+public class BrokerDesc extends ResourceDesc {

Review comment:
   ok








[GitHub] [incubator-doris] wyb commented on a change in pull request #3715: [Spark load] Fe create job

2020-05-29 Thread GitBox


wyb commented on a change in pull request #3715:
URL: https://github.com/apache/incubator-doris/pull/3715#discussion_r432797335



##
File path: fe/src/main/java/org/apache/doris/load/loadv2/JobState.java
##
@@ -21,6 +21,7 @@
 public enum JobState {
 UNKNOWN, // this is only for ISSUE #2354
 PENDING, // init state
+ETL, // load data partition, sort and aggregation with etl cluster

Review comment:
   JobState is persisted in the metadata by name, so the order of these 
states is not important.
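
   A minimal sketch of why persisting by name makes the declaration order irrelevant. The `JobStateCodec` helper is hypothetical and is not the Doris persistence code; the enum lists only a few states for illustration, and `LOADING` is assumed to be a state that follows `ETL`.

```java
// Illustrative only: an enum persisted by name survives the insertion of new constants.
enum JobState {
    UNKNOWN,
    PENDING,
    ETL,      // newly inserted constant; it shifts the ordinal of every later constant
    LOADING   // assumed later state, used here only for the demonstration
}

// Hypothetical codec: writes the constant's name rather than its ordinal.
final class JobStateCodec {
    static String write(JobState state) {
        return state.name();
    }

    static JobState read(String persisted) {
        // valueOf looks the constant up by name, so its position in the enum does not matter.
        return JobState.valueOf(persisted);
    }

    public static void main(String[] args) {
        // "LOADING" could have been written before ETL was added; it still reads back correctly.
        String persisted = "LOADING";
        System.out.println(persisted + " -> " + read(persisted));
    }
}
```

   Had the ordinal been persisted instead, inserting `ETL` before `LOADING` would have changed the meaning of values already stored.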








[GitHub] [incubator-doris] wyb edited a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wyb edited a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-621142283


   **Design doc**
   #2855 [Proposal] support spark load 
   #2887 [Proposal] Support Spark Convert Doris Segment 
   #3010 Spark load interface 
   






[GitHub] [incubator-doris] wyb edited a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wyb edited a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-635180109


   **Resource manager**
   #3418 [Spark load] Add resource manager (Merged)
   
   **Fe schedule job execution**
   #3712 [Spark load] Add spark etl job config
   #3718 [Spark load] Update push task thrift interface 
   #3715 [Spark load] Fe create job 
   #3716 [Spark load] Fe submit spark etl job 
   #3717 [Spark load] Fe process etl and loading state job 






[GitHub] [incubator-doris] wyb edited a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wyb edited a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-621142283


   design doc
   #2855 [Proposal] support spark load 
   #2887 [Proposal] Support Spark Convert Doris Segment 
   #3010 Spark load interface 
   






[GitHub] [incubator-doris] wyb edited a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wyb edited a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-635180109


   #3418 [Spark load] Add resource manager (Merged)
   
   #3712 [Spark load] Add spark etl job config
   #3718 [Spark load] Update push task thrift interface 
   #3715 [Spark load] Fe create job 
   #3716 [Spark load] Fe submit spark etl job 
   #3717 [Spark load] Fe process etl and loading state job 






[GitHub] [incubator-doris] wyb removed a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wyb removed a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-635310967










[GitHub] [incubator-doris] wyb edited a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wyb edited a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-635180109


   #3418 [Spark load] Add resource manager (finished)
   
   #3712 [Spark load] Add spark etl job config
   #3718 [Spark load] Update push task thrift interface 
   #3715 [Spark load] Fe create job 
   #3716 [Spark load] Fe submit spark etl job 
   #3717 [Spark load] Fe process etl and loading state job 






[GitHub] [incubator-doris] wyb removed a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wyb removed a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-635181748


   Add resource manager #3418 






[GitHub] [incubator-doris] morningman commented on pull request #3604: (#3464) fix Query failed when fact table has no data in join case

2020-05-29 Thread GitBox


morningman commented on pull request #3604:
URL: https://github.com/apache/incubator-doris/pull/3604#issuecomment-635962189


   > I think we can set `partition join` as default when we're unable to estimate the cost accurately and broadcast join cost equals partition join cost and without user hint.
   
   This LGTM.
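
   The agreed rule could be sketched as follows. This is only an illustration under stated assumptions: `JoinHint`, `DistributionMode`, and `JoinModeChooser` are hypothetical names, not the Doris planner classes, and the costs are plain numbers standing in for whatever the planner estimates.

```java
// Hypothetical types for illustration; not the Doris planner API.
enum JoinHint { NONE, BROADCAST, SHUFFLE }

enum DistributionMode { BROADCAST, PARTITIONED }

final class JoinModeChooser {
    // A user hint always wins. Without a hint the cheaper mode is chosen,
    // and a tie (for example when both estimates are unreliable and come out equal)
    // falls back to the partitioned join.
    static DistributionMode choose(JoinHint hint, long broadcastCost, long partitionCost) {
        if (hint == JoinHint.BROADCAST) {
            return DistributionMode.BROADCAST;
        }
        if (hint == JoinHint.SHUFFLE) {
            return DistributionMode.PARTITIONED;
        }
        return broadcastCost < partitionCost
                ? DistributionMode.BROADCAST
                : DistributionMode.PARTITIONED;
    }

    public static void main(String[] args) {
        // Equal costs and no hint: the partitioned join is the default.
        System.out.println(choose(JoinHint.NONE, 0L, 0L));
    }
}
```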






[GitHub] [incubator-doris] morningman commented on pull request #3723: Fix UT ThreadPoolManagerTest failure

2020-05-29 Thread GitBox


morningman commented on pull request #3723:
URL: https://github.com/apache/incubator-doris/pull/3723#issuecomment-635959969


   > ```
   > java.lang.AssertionError: expected:<0> but was:<1>
   >     at org.apache.doris.common.ThreadPoolManagerTest.testNormal(ThreadPoolManagerTest.java:64)
   > -
   > ```
   
   Could you explain why 700 is ok?






[GitHub] [incubator-doris] morningman opened a new pull request #3731: [Bug][FsBroker] NPE throw when username is empty

2020-05-29 Thread GitBox


morningman opened a new pull request #3731:
URL: https://github.com/apache/incubator-doris/pull/3731


   When using Broker with an empty username, an NPE is thrown, which is
   not expected.
   
   Fix: #3730






[GitHub] [incubator-doris] morningman opened a new issue #3730: [Bug][Broker] Null pointer exception when using broker

2020-05-29 Thread GitBox


morningman opened a new issue #3730:
URL: https://github.com/apache/incubator-doris/issues/3730


   **Describe the bug**
   When using a broker with an empty username, an NPE is thrown:
   
   `IllegalArgumentException: Null user`
   
   **To Reproduce**
   create a broker load with broker property:
   
   ```
   ("username" = "")
   ```
   
   **Expected behavior**
   No exception should be thrown.
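
   One possible guard, sketched under assumptions: treat an empty `username` property the same as a missing one and fall back to a default user before the value reaches any code that rejects it. The `BrokerUserResolver` class and the `DEFAULT_USER` value are hypothetical, and this is not necessarily how the actual fix works.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the FsBroker code.
final class BrokerUserResolver {
    // Assumed fallback user; the real broker may choose a different default.
    static final String DEFAULT_USER = "anonymous";

    // Returns the configured username, or the fallback when it is missing or blank.
    static String resolveUser(Map<String, String> properties) {
        String user = properties.get("username");
        if (user == null || user.trim().isEmpty()) {
            return DEFAULT_USER;
        }
        return user;
    }

    public static void main(String[] args) {
        Map<String, String> properties = new HashMap<>();
        properties.put("username", "");
        // The empty username from the reproduction step no longer propagates.
        System.out.println(resolveUser(properties));
    }
}
```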
   
   






[GitHub] [incubator-doris] wangbo edited a comment on issue #3433: [Spark load] Doris support Spark load

2020-05-29 Thread GitBox


wangbo edited a comment on issue #3433:
URL: https://github.com/apache/incubator-doris/issues/3433#issuecomment-626104019


   **Count Distinct Module**
   #3319 Support Java Version HyperLogLog(REVIEWING)
   #3061 Doris Support Using Hive Table to Build Global Dict(TESTING)
   #3088 Support Java version 64 bits Integers for BITMAP type(MERGED)
   
   **Spark DPP Module**
   #3726 [Spark Load] Rollup Tree Builder 
   #3728 [Spark Load] Using SparkDpp to complete some calculation in Spark Load 






[GitHub] [incubator-doris] wangbo opened a new pull request #3729: (#3728) Using SparkDpp to complete some calculation in Spark Load

2020-05-29 Thread GitBox


wangbo opened a new pull request #3729:
URL: https://github.com/apache/incubator-doris/pull/3729


   see #3728






[GitHub] [incubator-doris] wangbo opened a new issue #3728: [Spark Load] Using SparkDpp to complete ETL in Spark Load

2020-05-29 Thread GitBox


wangbo opened a new issue #3728:
URL: https://github.com/apache/incubator-doris/issues/3728


   






[GitHub] [incubator-doris] chaoyli closed issue #3442: [Memory Engine] Add MemTablet type to TabletMeta

2020-05-29 Thread GitBox


chaoyli closed issue #3442:
URL: https://github.com/apache/incubator-doris/issues/3442


   






[incubator-doris] branch master updated (93aae6b -> c967eaf)

2020-05-29 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from 93aae6b  [Bug] fix mixed used of counter (#3720)
 add c967eaf  [Memory Engine] Add TabletType to PartitionInfo and TabletMeta (#3668)

No new revisions were added by this update.

Summary of changes:
 be/src/olap/base_tablet.h  |  5 +++
 be/src/olap/tablet_meta.cpp| 29 +++---
 be/src/olap/tablet_meta.h  | 11 ++
 be/test/olap/test_data/header.txt  |  3 +-
 .../java/org/apache/doris/alter/RollupJobV2.java   |  5 ++-
 .../org/apache/doris/alter/SchemaChangeJobV2.java  |  3 +-
 .../analysis/ModifyTablePropertiesClause.java  |  2 +
 .../doris/analysis/SingleRangePartitionDesc.java   |  6 +++
 .../java/org/apache/doris/backup/RestoreJob.java   |  3 +-
 .../java/org/apache/doris/catalog/Catalog.java | 44 +++---
 .../org/apache/doris/catalog/PartitionInfo.java| 19 ++
 .../apache/doris/common/util/PropertyAnalyzer.java | 20 ++
 .../org/apache/doris/master/ReportHandler.java |  3 +-
 .../org/apache/doris/task/CreateReplicaTask.java   |  8 +++-
 .../org/apache/doris/common/util/UnitTestUtil.java |  2 +
 .../java/org/apache/doris/task/AgentTaskTest.java  |  3 +-
 gensrc/proto/olap_file.proto   |  6 +++
 gensrc/thrift/AgentService.thrift  |  6 +++
 18 files changed, 141 insertions(+), 37 deletions(-)





[GitHub] [incubator-doris] chaoyli merged pull request #3668: [Memory Engine] Add TabletType to PartitionInfo and TabletMeta

2020-05-29 Thread GitBox


chaoyli merged pull request #3668:
URL: https://github.com/apache/incubator-doris/pull/3668


   






[GitHub] [incubator-doris] wangbo opened a new pull request #3727: (#3726) [Spark Load] Rollup Tree Builder

2020-05-29 Thread GitBox


wangbo opened a new pull request #3727:
URL: https://github.com/apache/incubator-doris/pull/3727


   see #3726
   
   1. A tree data structure that describes a Doris table's rollups
   2. A builder that builds this data structure






[GitHub] [incubator-doris] wangbo opened a new issue #3726: [Spark Load] Rollup Tree Builder

2020-05-29 Thread GitBox


wangbo opened a new issue #3726:
URL: https://github.com/apache/incubator-doris/issues/3726


   In Spark Load, we want to aggregate the rollup tables of a Doris table.
   First, we need a data structure that describes the `derivative relationship` between a Doris table's rollups (for example, `rollup A` can be calculated from `rollup B`), so that we can calculate all rollup tables at minimal cost.
   
   Second, we need a builder to build the `derivative relationship` data structure.
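
   A minimal sketch of such a tree and its builder, under simplifying assumptions: a rollup is treated as derivable from another node when its column set is a subset of that node's columns, and "minimal cost" is approximated by attaching each rollup to the smallest node it can be derived from. `RollupNode` and `RollupTreeBuilder` are hypothetical names and this is not the implementation in the PR; real derivability would also have to consider aggregation types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical node: the base table or one rollup, with the columns it contains.
final class RollupNode {
    final String name;
    final Set<String> columns;
    final List<RollupNode> children = new ArrayList<>();
    RollupNode parent; // stays null for the base table

    RollupNode(String name, Set<String> columns) {
        this.name = name;
        this.columns = columns;
    }
}

final class RollupTreeBuilder {
    static Set<String> cols(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    // Attaches every rollup below the smallest already placed node that contains all its columns.
    static RollupNode build(RollupNode base, List<RollupNode> rollups) {
        // Place wider rollups first so that narrower ones can hang below them.
        List<RollupNode> ordered = new ArrayList<>(rollups);
        ordered.sort(Comparator.comparingInt((RollupNode n) -> n.columns.size()).reversed());

        List<RollupNode> placed = new ArrayList<>();
        placed.add(base);
        for (RollupNode rollup : ordered) {
            RollupNode best = null;
            for (RollupNode candidate : placed) {
                boolean derivable = candidate.columns.containsAll(rollup.columns);
                if (derivable && (best == null || candidate.columns.size() < best.columns.size())) {
                    best = candidate;
                }
            }
            if (best == null) {
                throw new IllegalArgumentException(
                        "rollup " + rollup.name + " cannot be derived from the base table");
            }
            rollup.parent = best;
            best.children.add(rollup);
            placed.add(rollup);
        }
        return base;
    }

    public static void main(String[] args) {
        RollupNode base = new RollupNode("base", cols("k1", "k2", "k3", "v1"));
        RollupNode b = new RollupNode("rollup_b", cols("k1", "k2", "v1"));
        RollupNode a = new RollupNode("rollup_a", cols("k1", "v1"));
        build(base, Arrays.asList(a, b));
        System.out.println(a.name + " is calculated from " + a.parent.name);
    }
}
```

   Walking the resulting tree from the root lets each rollup be computed from its parent's output rather than from the base table every time.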
   






[GitHub] [incubator-doris] vagetablechicken opened a new issue #3725: Convert shouldn't core in HttpHandlers

2020-05-29 Thread GitBox


vagetablechicken opened a new issue #3725:
URL: https://github.com/apache/incubator-doris/issues/3725


   **Describe the bug**
   We use std conversion functions but don't catch the exceptions they throw.
   
https://github.com/apache/incubator-doris/blob/93aae6bdff2e32a1a0e31c4844b99e943b172679/be/src/http/action/meta_action.cpp#L50-L51
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Visit http://be_host:port/api/meta/header/foo/bar
   2. The BE crashes with a core dump.
   
   **Expected behavior**
   If the conversion fails, an error should be returned.
   
   Working on it.






[GitHub] [incubator-doris] wangbo opened a new pull request #3063: (#3061) [Spark Load] Doris Support Using Hive Table to Build Global Dict

2020-05-29 Thread GitBox


wangbo opened a new pull request #3063:
URL: https://github.com/apache/incubator-doris/pull/3063


   #3061






[GitHub] [incubator-doris] wangbo closed pull request #3063: (#3061) [Spark Load] Doris Support Using Hive Table to Build Global Dict

2020-05-29 Thread GitBox


wangbo closed pull request #3063:
URL: https://github.com/apache/incubator-doris/pull/3063


   






[GitHub] [incubator-doris] EmmyMiao87 commented on pull request #3601: fix large string val allocation failure, #3600

2020-05-29 Thread GitBox


EmmyMiao87 commented on pull request #3601:
URL: https://github.com/apache/incubator-doris/pull/3601#issuecomment-635895100


   This PR has some errors. The correct PR is #3724.






[GitHub] [incubator-doris] EmmyMiao87 closed pull request #3601: fix large string val allocation failure, #3600

2020-05-29 Thread GitBox


EmmyMiao87 closed pull request #3601:
URL: https://github.com/apache/incubator-doris/pull/3601


   






[GitHub] [incubator-doris] EmmyMiao87 opened a new pull request #3724: Fix large string val allocation failure

2020-05-29 Thread GitBox


EmmyMiao87 opened a new pull request #3724:
URL: https://github.com/apache/incubator-doris/pull/3724


   A large bitmap needs StringVal to allocate a block of memory that is larger than MAX_INT.
   The resulting overflow causes serialization of the bitmap to fail.
   
   Fixed #3600
   
   Change-Id: I720b9ea4646188a8e1402630601928b5f16fedb2






[GitHub] [incubator-doris] decster opened a new pull request #3723: Fix UT ThreadPoolManagerTest failure

2020-05-29 Thread GitBox


decster opened a new pull request #3723:
URL: https://github.com/apache/incubator-doris/pull/3723


   ```
   java.lang.AssertionError: expected:<0> but was:<1>
       at org.apache.doris.common.ThreadPoolManagerTest.testNormal(ThreadPoolManagerTest.java:64)
   -
   ```






[GitHub] [incubator-doris] chaoyli closed issue #3721: mix used of counter

2020-05-29 Thread GitBox


chaoyli closed issue #3721:
URL: https://github.com/apache/incubator-doris/issues/3721


   






[GitHub] [incubator-doris] chaoyli merged pull request #3720: [Bug] fix mixed used of counter

2020-05-29 Thread GitBox


chaoyli merged pull request #3720:
URL: https://github.com/apache/incubator-doris/pull/3720


   






[incubator-doris] branch master updated: [Bug] fix mixed used of counter (#3720)

2020-05-29 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 93aae6b  [Bug] fix mixed used of counter (#3720)
93aae6b is described below

commit 93aae6bdff2e32a1a0e31c4844b99e943b172679
Author: lichaoyong 
AuthorDate: Fri May 29 15:36:21 2020 +0800

[Bug] fix mixed used of counter (#3720)

MysqlResultWriter _sent_rows_counter and _result_send_timer are mixed used.
It will results core dump when checking counter->type().
---
 be/src/runtime/mysql_result_writer.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/be/src/runtime/mysql_result_writer.cpp b/be/src/runtime/mysql_result_writer.cpp
index a0d2426..036ec4a 100644
--- a/be/src/runtime/mysql_result_writer.cpp
+++ b/be/src/runtime/mysql_result_writer.cpp
@@ -229,7 +229,7 @@ Status MysqlResultWriter::append_row_batch(const RowBatch* batch) {
 }
 
 if (status.ok()) {
-SCOPED_TIMER(_sent_rows_counter);
+SCOPED_TIMER(_result_send_timer);
 // push this batch to back
 status = _sinker->add_batch(result);
 





[GitHub] [incubator-doris] EmmyMiao87 opened a new issue #3344: [Proposal] Materialized View 2.0

2020-05-29 Thread GitBox


EmmyMiao87 opened a new issue #3344:
URL: https://github.com/apache/incubator-doris/issues/3344


   # The status of materialized view 1.0
   
   Doris has supported materialized views since version 0.12. The materialized view selector picks the most efficient mv and rewrites the SQL to query the selected mv instead of the base table.
   When a query returns a small number of rows from an original table that holds a large amount of data, performance can improve by 5x to 100x, depending on the cardinality of the data.
   The aggregate functions supported by materialized views in 0.12 are: sum, min, max.
   
   However, these aggregate functions are not rich enough to fully cover users' scenarios.
   For example, in the `Order` scenario, users need to analyze the number of orders across different dimensions.
   Another example is the `count_distinct` function, which is used to analyze PV and UV data in website traffic.
   
   # The goal of materialized view 2.0
   
   To support more scenarios, materialized view 2.0 will support the following:
   
   1. Materialized views support the aggregate functions count and count_distinct (bitmap and hll).
   2. Materialized views can be created on the same column with different aggregate functions. For example: `select k1, sum(v1), min(v1) from table group by k1`


