[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-25 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r955507870


##
tests/cpp-runtime/hexagon/ring_buffer_tests.cc:
##
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+

Review Comment:
   Added a unit test which overflows the HexagonUserDMA ring buffer for DMA 
descriptors.  The test is named `overflow_ring_buffer`.  Also added a return of 
`DMA_RETRY` from the HexagonUserDMA `Copy` function in case the ring buffer is 
full.  This passes the onus to the user of the class to retry the `Copy`.  The 
other way to handle it is for `Copy` to `Wait` until there is a free slot in 
the ring buffer to use.  Not sure what is best at this moment.  I am hoping the 
unit test provides the desired coverage.  Would like to add integration tests 
in a separate test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-25 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r955489439


##
tests/cpp-runtime/hexagon/ring_buffer_tests.cc:
##
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+

Review Comment:
   Talked with @csullivan and his point was less about the `RingBuffer` class 
and more about the HexagonUserDMA engine and whether there would be any issue 
if we are overwriting descriptors (even if they are `done`) in a HexagonUserDMA 
chain.  I.e. is it OK to overwrite the head of the descriptor chain once it is 
done?  Perhaps @kparzysz-quic or @arangasa have some comment.  A test could not 
hurt.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-25 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r955488224


##
tests/cpp-runtime/hexagon/ring_buffer_tests.cc:
##
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+

Review Comment:
   > By the way, it's interesting you were thinking about corner cases because 
I woke up thinking about the Next vs. Add API as described above. And, I also 
realized there is a bug in the InFlight method where we might increment the 
"oldest" ID beyond the "next" ID. Will add a test case that can hit that bug 
and fix it in the next commit.
   
   This is now fixed.
   
   > There is one issue with this flow that I thought of ... the DMA descriptor 
at ring buffer slot 0 is "finished" and we simply return a pointer to it and 
expect that the user will add an "inflight" DMA descriptor at that location. If 
the user were to call InFlight before adding the "inflight" descriptor then 
ring buffer slot 0 would be treated as "finished" from the perspective of the 
RingBuffer class where it is still "inflight" from the perspective of the user.
   
   I decided to leave this as-is and encoded this corner case in a test called 
`wrap_corner`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-25 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r955269061


##
tests/cpp-runtime/hexagon/ring_buffer_tests.cc:
##
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+

Review Comment:
   By the way, it's interesting you were thinking about corner cases because I 
woke up thinking about the `Next` vs. `Add` API as described above.  And, I 
also realized there is a bug in the `InFlight` method where we might increment 
the "oldest" ID beyond the "next" ID.  Will add a test case that can hit that 
bug and fix it in the next commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-25 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r955267553


##
tests/cpp-runtime/hexagon/ring_buffer_tests.cc:
##
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+

Review Comment:
   I would change your example slightly to make the descriptor IDs zero based.  
This is how it works in the code.  The first descriptor has ID=0 and uses ring 
buffer slot 0.  The second has ID=1 and uses slot 1.  When ID is equivalent to 
ring buffer size we reuse ring buffer slot 0.  And so on.
   
   ```
   Ringbuffer[size=4]
   
   Initial state:
   desc0@0[finished] , desc1@1[finished] , desc2@2[finished] , desc3@3[finished]
   
   New DMA needed:
   desc4@0[inflight] , desc1@1[finished] , desc2@2[finished] , desc3@3[finished]
   ```
   
   The case you have described above should work OK.  I believe we can cover it 
with `RingBuffer` unit tests.  I believe existing tests already hit this case, 
to some degree.  I will add a test case to explicitly cover this scenario.
   
   When we go to add `desc4` we first check if the ring buffer is full by 
checking that `InFlight` count (0) is less than the size of the ring buffer 
(4).  All good.  Then, we `GetAddr` of the "next" ID = 4 which reuses ring 
buffer slot 0.  Then, we increment the "next" ID to 5.  Then we return a 
pointer to ring buffer slot 0.
   
   There is one issue with this flow that I thought of ... the DMA descriptor 
at ring buffer slot 0 is "finished" and we simply return a pointer to it and 
expect that the user will add an "inflight" DMA descriptor at that location.  
If the user were to call `InFlight` before adding the "inflight" descriptor 
then ring buffer slot 0 would be treated as "finished" from the perspective of 
the `RingBuffer` class where it is still "inflight" from the perspective of the 
user.
   
   I had initially coded an `Add` function which took a function pointer which 
would add a T to the RingBuffer rather that `Next` to get around this issue.  
It felt to complicated but it does expose a gap.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-24 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r954407081


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -82,12 +95,13 @@ int HexagonUserDMA::Copy(void* dst, void* src, uint32_t 
length) {
 dmstart(dma_desc);
 first_dma_ = false;
   } else {
-// `dmlink` descriptor to tail
-dmlink(dma_descriptors_.back(), dma_desc);
+// `dmlink` descriptor to tail descriptor
+void* tail = GetDescriptorAddr(id_next_dma_desc_ - 1);
+dmlink(tail, dma_desc);
   }
 
-  // set descriptor as new tail
-  dma_descriptors_.push_back(dma_desc);
+  // update the ID of the next DMA descriptor
+  id_next_dma_desc_++;

Review Comment:
   Added new RingBuffer class along with testing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-24 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r954406986


##
tests/cpp-runtime/hexagon/hexagon_user_dma_tests.cc:
##
@@ -57,9 +57,9 @@ TEST_F(HexagonUserDMATest, wait) {
   HexagonUserDMA::Get().Wait(10);
 }
 

Review Comment:
   Added new RingBuffer class along with testing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-23 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r952910319


##
tests/cpp-runtime/hexagon/hexagon_user_dma_tests.cc:
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+
+#include "../src/runtime/hexagon/hexagon_user_dma.h"
+
+using namespace tvm::runtime;
+using namespace tvm::runtime::hexagon;
+
+class HexagonUserDMATest : public ::testing::Test {
+  void SetUp() override {
+src = malloc(length);
+dst = malloc(length);
+ASSERT_NE(src, nullptr);
+ASSERT_NE(dst, nullptr);
+
+src_char = static_cast(src);
+dst_char = static_cast(dst);
+for (uint32_t i = 0; i < length; ++i) {
+  src_char[i] = 1;
+  dst_char[i] = 0;
+}
+  }
+  void TearDown() override {
+free(src);
+free(dst);
+  }
+
+ public:
+  int ret{0};
+  void* src{nullptr};
+  void* dst{nullptr};
+  char* src_char{nullptr};
+  char* dst_char{nullptr};
+  uint32_t length{0x40};  // 4MB
+};
+
+TEST_F(HexagonUserDMATest, wait) {
+  HexagonUserDMA::Get().Wait(0);
+  HexagonUserDMA::Get().Wait(10);
+}
+
+TEST_F(HexagonUserDMATest, poll) { ASSERT_EQ(HexagonUserDMA::Get().Poll(), 0); 
}
+
+TEST_F(HexagonUserDMATest, bad_copy) {
+  uint64_t bigaddr = 0x1;
+  void* src64 = reinterpret_cast(bigaddr);
+  void* dst64 = reinterpret_cast(bigaddr);
+  uint32_t biglength = 0x100;
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst64, src, length), DMA_SUCCESS);
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst, src64, length), DMA_SUCCESS);
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst, src, biglength), DMA_SUCCESS);
+}
+
+TEST_F(HexagonUserDMATest, sync_dma) {
+  // kick off 1 DMA
+  ret = HexagonUserDMA::Get().Copy(dst, src, length);
+  ASSERT_EQ(ret, DMA_SUCCESS);
+
+  // wait for DMA to complete
+  HexagonUserDMA::Get().Wait(0);
+
+  // verify
+  for (uint32_t i = 0; i < length; ++i) {
+ASSERT_EQ(src_char[i], dst_char[i]);
+  }
+}
+
+TEST_F(HexagonUserDMATest, async_dma) {
+  // kick off 10x duplicate DMAs
+  for (uint32_t i = 0; i < 10; ++i) {
+ret = HexagonUserDMA::Get().Copy(dst, src, length);
+ASSERT_EQ(ret, DMA_SUCCESS);
+  }
+
+  // verify at least 1 DMA in flight
+  // TODO: re-enable when CI runs on hardware - fails on simulator
+  // ASSERT_GT(HexagonUserDMA::Get().Poll(), 0);

Review Comment:
   Added a task to the backlog to figure out nightly testing of async behavior 
on hardware.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-22 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r951975770


##
src/runtime/hexagon/hexagon_user_dma.h:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+
+#ifndef TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+#define TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+
+namespace tvm {
+namespace runtime {
+namespace hexagon {
+
+#define DMA_SUCCESS 0
+#define DMA_FAILURE -1
+
+class HexagonUserDMA {
+ public:
+  /*!
+   * \brief Initiate DMA to copy memory from source to destination address
+   * \param dst Destination address
+   * \param src Source address
+   * \param length Length in bytes to copy
+   * \returns Status, either DMA_SUCCESS or DMA_FAILURE
+   */
+  int Copy(void* dst, void* src, uint32_t length);
+
+  /*!
+   * \brief Wait until the number of DMAs in flight is less than or equal to 
some maximum
+   * \param max_dmas_in_flight Maximum number of DMAs allowed to be in flight
+   * to satisfy the `Wait` e.g. use `Wait(0)` to wait on "all" outstanding 
DMAs to complete
+   */
+  void Wait(uint32_t max_dmas_in_flight);
+
+  /*!
+   * \brief Poll the number of DMAs in flight
+   * \returns Number of DMAs in flight
+   */
+  uint32_t Poll();
+
+  //! HexagonUserDMA uses the singleton pattern
+  static HexagonUserDMA& Get() {
+static HexagonUserDMA* hud = new HexagonUserDMA();
+return *hud;
+  }
+
+ private:
+  HexagonUserDMA() = default;
+  ~HexagonUserDMA();
+  HexagonUserDMA(const HexagonUserDMA&) = delete;
+  HexagonUserDMA& operator=(const HexagonUserDMA&) = delete;
+  HexagonUserDMA(HexagonUserDMA&&) = delete;
+  HexagonUserDMA& operator=(HexagonUserDMA&&) = delete;
+
+  //! \brief Initializes / resets the Hexagon User DMA engine
+  unsigned int Init();
+
+  //! \brief Calculates the number of DMAs in flight
+  uint32_t DMAsInFlight();
+
+  //! \brief Stores descriptors for all DMAs
+  std::vector dma_descriptors_;

Review Comment:
   Marking this resolved with the latest commit which will `use ring buffer to 
store DMA descriptors`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-19 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r950496671


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -90,32 +78,80 @@ int hexagon_user_dma_1d_sync_helper(void* dst, void* src, 
uint32_t length) {
   dma_desc_set_src(dma_desc, src32);
   dma_desc_set_dst(dma_desc, dst32);
 
-  dmstart(dma_desc);
-  unsigned int status = dmwait() & DM0_STATUS_MASK;
-  unsigned int done = dma_desc_get_done(dma_desc);
+  // only for first DMA
+  if (first_dma_) {
+// reset DMA engine
+auto status = Init();
+if (status != DM0_STATUS_IDLE) {
+  return DMA_FAILURE;
+}
+
+// `dmstart` first descriptor
+dmstart(dma_desc);
+first_dma_ = false;
+  } else {
+// `dmlink` descriptor to tail
+dmlink(dma_descriptors_.back(), dma_desc);
+  }
 
-  free(dma_desc);
+  // set descriptor as new tail
+  dma_descriptors_.push_back(dma_desc);
 
-  if (status == DM0_STATUS_IDLE && done == DESC_DONE_COMPLETE) {
-return DMA_SUCCESS;
+  return DMA_SUCCESS;
+}
+
+void HexagonUserDMA::Wait(uint32_t max_dmas_in_flight) {
+  // wait (forever) until max DMAs in flight <= actual DMAs in flight
+  while (DMAsInFlight() > max_dmas_in_flight) {
+  }
+}
+
+uint32_t HexagonUserDMA::Poll() { return DMAsInFlight(); }
+
+uint32_t HexagonUserDMA::DMAsInFlight() {
+  // poll DMA engine to make sure DMA status is current
+  dmpoll();
+
+  // find the oldest DMA in flight
+  for (; oldest_dma_in_flight_ < dma_descriptors_.size(); 
++oldest_dma_in_flight_) {
+// read the `done` bit from the DMA descriptor and stop if incomplete
+unsigned int done = 
dma_desc_get_done(dma_descriptors_[oldest_dma_in_flight_]);
+if (done == DESC_DONE_INCOMPLETE) {
+  break;
+}
+  }
+  // total DMAs in flight = total DMAs - oldest DMA in flight
+  return dma_descriptors_.size() - oldest_dma_in_flight_;
+}
+
+HexagonUserDMA::~HexagonUserDMA() {
+  Init();  // reset DMA engine
+  for (auto dma_desc : dma_descriptors_) {
+free(dma_desc);
   }
-#endif
-  return DMA_FAILURE;
 }
 
 int hexagon_user_dma_1d_sync(void* dst, void* src, uint32_t length) {

Review Comment:
   Marking this resolved as I am confident that deferring is the correct choice 
here.  Happy to be corrected if I'm wrong and this is needed now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-19 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r950496671


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -90,32 +78,80 @@ int hexagon_user_dma_1d_sync_helper(void* dst, void* src, 
uint32_t length) {
   dma_desc_set_src(dma_desc, src32);
   dma_desc_set_dst(dma_desc, dst32);
 
-  dmstart(dma_desc);
-  unsigned int status = dmwait() & DM0_STATUS_MASK;
-  unsigned int done = dma_desc_get_done(dma_desc);
+  // only for first DMA
+  if (first_dma_) {
+// reset DMA engine
+auto status = Init();
+if (status != DM0_STATUS_IDLE) {
+  return DMA_FAILURE;
+}
+
+// `dmstart` first descriptor
+dmstart(dma_desc);
+first_dma_ = false;
+  } else {
+// `dmlink` descriptor to tail
+dmlink(dma_descriptors_.back(), dma_desc);
+  }
 
-  free(dma_desc);
+  // set descriptor as new tail
+  dma_descriptors_.push_back(dma_desc);
 
-  if (status == DM0_STATUS_IDLE && done == DESC_DONE_COMPLETE) {
-return DMA_SUCCESS;
+  return DMA_SUCCESS;
+}
+
+void HexagonUserDMA::Wait(uint32_t max_dmas_in_flight) {
+  // wait (forever) until max DMAs in flight <= actual DMAs in flight
+  while (DMAsInFlight() > max_dmas_in_flight) {
+  }
+}
+
+uint32_t HexagonUserDMA::Poll() { return DMAsInFlight(); }
+
+uint32_t HexagonUserDMA::DMAsInFlight() {
+  // poll DMA engine to make sure DMA status is current
+  dmpoll();
+
+  // find the oldest DMA in flight
+  for (; oldest_dma_in_flight_ < dma_descriptors_.size(); 
++oldest_dma_in_flight_) {
+// read the `done` bit from the DMA descriptor and stop if incomplete
+unsigned int done = 
dma_desc_get_done(dma_descriptors_[oldest_dma_in_flight_]);
+if (done == DESC_DONE_INCOMPLETE) {
+  break;
+}
+  }
+  // total DMAs in flight = total DMAs - oldest DMA in flight
+  return dma_descriptors_.size() - oldest_dma_in_flight_;
+}
+
+HexagonUserDMA::~HexagonUserDMA() {
+  Init();  // reset DMA engine
+  for (auto dma_desc : dma_descriptors_) {
+free(dma_desc);
   }
-#endif
-  return DMA_FAILURE;
 }
 
 int hexagon_user_dma_1d_sync(void* dst, void* src, uint32_t length) {

Review Comment:
   Marking this resolved as I am confident that deferring is the correct choice 
here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-19 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r950333067


##
src/runtime/hexagon/hexagon_user_dma.h:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+
+#ifndef TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+#define TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+
+namespace tvm {
+namespace runtime {
+namespace hexagon {
+
+#define DMA_SUCCESS 0
+#define DMA_FAILURE -1
+
+class HexagonUserDMA {
+ public:
+  /*!
+   * \brief Initiate DMA to copy memory from source to destination address
+   * \param dst Destination address
+   * \param src Source address
+   * \param length Length in bytes to copy
+   * \returns Status, either DMA_SUCCESS or DMA_FAILURE
+   */
+  int Copy(void* dst, void* src, uint32_t length);
+
+  /*!
+   * \brief Wait until the number of DMAs in flight is less than or equal to 
some maximum
+   * \param max_dmas_in_flight Maximum number of DMAs allowed to be in flight
+   * to satisfy the `Wait` e.g. use `Wait(0)` to wait on "all" outstanding 
DMAs to complete
+   */
+  void Wait(uint32_t max_dmas_in_flight);
+
+  /*!
+   * \brief Poll the number of DMAs in flight
+   * \returns Number of DMAs in flight
+   */
+  uint32_t Poll();
+
+  //! HexagonUserDMA uses the singleton pattern
+  static HexagonUserDMA& Get() {
+static HexagonUserDMA* hud = new HexagonUserDMA();
+return *hud;
+  }
+
+ private:
+  HexagonUserDMA() = default;
+  ~HexagonUserDMA();
+  HexagonUserDMA(const HexagonUserDMA&) = delete;
+  HexagonUserDMA& operator=(const HexagonUserDMA&) = delete;
+  HexagonUserDMA(HexagonUserDMA&&) = delete;
+  HexagonUserDMA& operator=(HexagonUserDMA&&) = delete;
+
+  //! \brief Initializes / resets the Hexagon User DMA engine
+  unsigned int Init();
+
+  //! \brief Calculates the number of DMAs in flight
+  uint32_t DMAsInFlight();
+
+  //! \brief Stores descriptors for all DMAs
+  std::vector dma_descriptors_;

Review Comment:
   Maybe a good middle ground would be to `free` descriptors once we discover 
they are marked `done` in the `DMAsInFlight` method.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949709289


##
src/runtime/hexagon/hexagon_user_dma.h:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+
+#ifndef TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+#define TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+
+namespace tvm {
+namespace runtime {
+namespace hexagon {
+
+#define DMA_SUCCESS 0
+#define DMA_FAILURE -1
+
+class HexagonUserDMA {
+ public:
+  /*!
+   * \brief Initiate DMA to copy memory from source to destination address
+   * \param dst Destination address
+   * \param src Source address
+   * \param length Length in bytes to copy
+   * \returns Status, either DMA_SUCCESS or DMA_FAILURE
+   */
+  int Copy(void* dst, void* src, uint32_t length);
+
+  /*!
+   * \brief Wait until the number of DMAs in flight is less than or equal to 
some maximum
+   * \param max_dmas_in_flight Maximum number of DMAs allowed to be in flight
+   * to satisfy the `Wait` e.g. use `Wait(0)` to wait on "all" outstanding 
DMAs to complete
+   */
+  void Wait(uint32_t max_dmas_in_flight);
+
+  /*!
+   * \brief Poll the number of DMAs in flight
+   * \returns Number of DMAs in flight
+   */
+  uint32_t Poll();
+
+  //! HexagonUserDMA uses the singleton pattern
+  static HexagonUserDMA& Get() {
+static HexagonUserDMA* hud = new HexagonUserDMA();
+return *hud;
+  }
+
+ private:
+  HexagonUserDMA() = default;
+  ~HexagonUserDMA();
+  HexagonUserDMA(const HexagonUserDMA&) = delete;
+  HexagonUserDMA& operator=(const HexagonUserDMA&) = delete;
+  HexagonUserDMA(HexagonUserDMA&&) = delete;
+  HexagonUserDMA& operator=(HexagonUserDMA&&) = delete;
+
+  //! \brief Initializes / resets the Hexagon User DMA engine
+  unsigned int Init();
+
+  //! \brief Calculates the number of DMAs in flight
+  uint32_t DMAsInFlight();
+
+  //! \brief Stores descriptors for all DMAs
+  std::vector dma_descriptors_;

Review Comment:
   Possible issue - @csullivan informs me that the destructor may not get 
called for Hexagon runtime classes given that we kill the RPC to avoid zombie 
processes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949709014


##
src/runtime/hexagon/hexagon_user_dma_instructions.h:
##
@@ -24,8 +24,6 @@ namespace tvm {
 namespace runtime {
 namespace hexagon {
 
-#if __HEXAGON_ARCH__ >= 68

Review Comment:
   Removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949031227


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -90,32 +78,80 @@ int hexagon_user_dma_1d_sync_helper(void* dst, void* src, 
uint32_t length) {
   dma_desc_set_src(dma_desc, src32);
   dma_desc_set_dst(dma_desc, dst32);
 
-  dmstart(dma_desc);
-  unsigned int status = dmwait() & DM0_STATUS_MASK;
-  unsigned int done = dma_desc_get_done(dma_desc);
+  // only for first DMA
+  if (first_dma_) {
+// reset DMA engine
+auto status = Init();
+if (status != DM0_STATUS_IDLE) {
+  return DMA_FAILURE;
+}
+
+// `dmstart` first descriptor
+dmstart(dma_desc);
+first_dma_ = false;
+  } else {
+// `dmlink` descriptor to tail
+dmlink(dma_descriptors_.back(), dma_desc);
+  }
 
-  free(dma_desc);
+  // set descriptor as new tail
+  dma_descriptors_.push_back(dma_desc);
 
-  if (status == DM0_STATUS_IDLE && done == DESC_DONE_COMPLETE) {
-return DMA_SUCCESS;
+  return DMA_SUCCESS;
+}
+
+void HexagonUserDMA::Wait(uint32_t max_dmas_in_flight) {
+  // wait (forever) until max DMAs in flight <= actual DMAs in flight
+  while (DMAsInFlight() > max_dmas_in_flight) {
+  }
+}
+
+uint32_t HexagonUserDMA::Poll() { return DMAsInFlight(); }
+
+uint32_t HexagonUserDMA::DMAsInFlight() {
+  // poll DMA engine to make sure DMA status is current
+  dmpoll();
+
+  // find the oldest DMA in flight
+  for (; oldest_dma_in_flight_ < dma_descriptors_.size(); 
++oldest_dma_in_flight_) {
+// read the `done` bit from the DMA descriptor and stop if incomplete
+unsigned int done = 
dma_desc_get_done(dma_descriptors_[oldest_dma_in_flight_]);
+if (done == DESC_DONE_INCOMPLETE) {
+  break;
+}
+  }
+  // total DMAs in flight = total DMAs - oldest DMA in flight
+  return dma_descriptors_.size() - oldest_dma_in_flight_;
+}
+
+HexagonUserDMA::~HexagonUserDMA() {
+  Init();  // reset DMA engine
+  for (auto dma_desc : dma_descriptors_) {
+free(dma_desc);
   }
-#endif
-  return DMA_FAILURE;
 }
 
 int hexagon_user_dma_1d_sync(void* dst, void* src, uint32_t length) {

Review Comment:
   My thought was to do this in the next phase of development when we work on 
TIR lowering to Async DMA.  What do you think @csullivan?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949707780


##
tests/cpp-runtime/hexagon/hexagon_user_dma_tests.cc:
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+
+#include "../src/runtime/hexagon/hexagon_user_dma.h"
+
+using namespace tvm::runtime;
+using namespace tvm::runtime::hexagon;
+
+class HexagonUserDMATest : public ::testing::Test {
+  void SetUp() override {
+src = malloc(length);
+dst = malloc(length);
+ASSERT_NE(src, nullptr);
+ASSERT_NE(dst, nullptr);
+
+src_char = static_cast(src);
+dst_char = static_cast(dst);
+for (uint32_t i = 0; i < length; ++i) {
+  src_char[i] = 1;
+  dst_char[i] = 0;
+}
+  }
+  void TearDown() override {
+free(src);
+free(dst);
+  }
+
+ public:
+  int ret{0};
+  void* src{nullptr};
+  void* dst{nullptr};
+  char* src_char{nullptr};
+  char* dst_char{nullptr};
+  uint32_t length{0x40};  // 4MB
+};
+
+TEST_F(HexagonUserDMATest, wait) {
+  HexagonUserDMA::Get().Wait(0);
+  HexagonUserDMA::Get().Wait(10);
+}
+
+TEST_F(HexagonUserDMATest, poll) { ASSERT_EQ(HexagonUserDMA::Get().Poll(), 0); 
}
+
+TEST_F(HexagonUserDMATest, bad_copy) {
+  uint64_t bigaddr = 0x1;
+  void* src64 = reinterpret_cast(bigaddr);
+  void* dst64 = reinterpret_cast(bigaddr);
+  uint32_t biglength = 0x100;
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst64, src, length), DMA_SUCCESS);
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst, src64, length), DMA_SUCCESS);
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst, src, biglength), DMA_SUCCESS);
+}
+
+TEST_F(HexagonUserDMATest, sync_dma) {
+  // kick off 1 DMA
+  ret = HexagonUserDMA::Get().Copy(dst, src, length);
+  ASSERT_EQ(ret, DMA_SUCCESS);
+
+  // wait for DMA to complete
+  HexagonUserDMA::Get().Wait(0);
+
+  // verify
+  for (uint32_t i = 0; i < length; ++i) {
+ASSERT_EQ(src_char[i], dst_char[i]);
+  }
+}
+
+TEST_F(HexagonUserDMATest, async_dma) {
+  // kick off 10x duplicate DMAs
+  for (uint32_t i = 0; i < 10; ++i) {
+ret = HexagonUserDMA::Get().Copy(dst, src, length);
+ASSERT_EQ(ret, DMA_SUCCESS);
+  }
+
+  // verify at least 1 DMA in flight
+  // TODO: re-enable when CI runs on hardware - fails on simulator
+  // ASSERT_GT(HexagonUserDMA::Get().Poll(), 0);

Review Comment:
   The more I think about it, this check for an "in flight" count greater than 
0 does not really add any value.  On the simulator, the "in flight" count is 
always 0 as if DMAs complete automatically so the test fails.  On hardware, the 
"in flight" count should be non-zero but we have other ways to validate the 
polling mechanism is working correctly.  I added `async_dma_poll` test case 
which has the correct coverage, I think.  Let me know what you think @csullivan.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949489424


##
src/runtime/hexagon/hexagon_user_dma.h:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+
+#ifndef TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+#define TVM_RUNTIME_HEXAGON_HEXAGON_USER_DMA_H_
+
+namespace tvm {
+namespace runtime {
+namespace hexagon {
+
+#define DMA_SUCCESS 0
+#define DMA_FAILURE -1
+
+class HexagonUserDMA {
+ public:
+  /*!
+   * \brief Initiate DMA to copy memory from source to destination address
+   * \param dst Destination address
+   * \param src Source address
+   * \param length Length in bytes to copy
+   * \returns Status, either DMA_SUCCESS or DMA_FAILURE
+   */
+  int Copy(void* dst, void* src, uint32_t length);
+
+  /*!
+   * \brief Wait until the number of DMAs in flight is less than or equal to 
some maximum
+   * \param max_dmas_in_flight Maximum number of DMAs allowed to be in flight
+   * to satisfy the `Wait` e.g. use `Wait(0)` to wait on "all" outstanding 
DMAs to complete
+   */
+  void Wait(uint32_t max_dmas_in_flight);
+
+  /*!
+   * \brief Poll the number of DMAs in flight
+   * \returns Number of DMAs in flight
+   */
+  uint32_t Poll();
+
+  //! HexagonUserDMA uses the singleton pattern
+  static HexagonUserDMA& Get() {
+static HexagonUserDMA* hud = new HexagonUserDMA();
+return *hud;
+  }
+
+ private:
+  HexagonUserDMA() = default;
+  ~HexagonUserDMA();
+  HexagonUserDMA(const HexagonUserDMA&) = delete;
+  HexagonUserDMA& operator=(const HexagonUserDMA&) = delete;
+  HexagonUserDMA(HexagonUserDMA&&) = delete;
+  HexagonUserDMA& operator=(HexagonUserDMA&&) = delete;
+
+  //! \brief Initializes / resets the Hexagon User DMA engine
+  unsigned int Init();
+
+  //! \brief Calculates the number of DMAs in flight
+  uint32_t DMAsInFlight();
+
+  //! \brief Stores descriptors for all DMAs
+  std::vector dma_descriptors_;

Review Comment:
   Yes, in the destructor:
   ```
   HexagonUserDMA::~HexagonUserDMA() {
 Init();  // reset DMA engine
 for (auto dma_desc : dma_descriptors_) {
   free(dma_desc);
 }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949484536


##
src/runtime/hexagon/hexagon_user_dma_instructions.h:
##
@@ -24,8 +24,6 @@ namespace tvm {
 namespace runtime {
 namespace hexagon {
 
-#if __HEXAGON_ARCH__ >= 68

Review Comment:
   OK, I will revert the checks for `#if __HEXAGON_ARCH__ >= 68` assuming it's 
a workable plan to require the TVM runtime to be compiled for v68 and above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949483132


##
src/runtime/hexagon/hexagon_user_dma.h:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 

Review Comment:
   Good catch, done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949031227


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -90,32 +78,80 @@ int hexagon_user_dma_1d_sync_helper(void* dst, void* src, 
uint32_t length) {
   dma_desc_set_src(dma_desc, src32);
   dma_desc_set_dst(dma_desc, dst32);
 
-  dmstart(dma_desc);
-  unsigned int status = dmwait() & DM0_STATUS_MASK;
-  unsigned int done = dma_desc_get_done(dma_desc);
+  // only for first DMA
+  if (first_dma_) {
+// reset DMA engine
+auto status = Init();
+if (status != DM0_STATUS_IDLE) {
+  return DMA_FAILURE;
+}
+
+// `dmstart` first descriptor
+dmstart(dma_desc);
+first_dma_ = false;
+  } else {
+// `dmlink` descriptor to tail
+dmlink(dma_descriptors_.back(), dma_desc);
+  }
 
-  free(dma_desc);
+  // set descriptor as new tail
+  dma_descriptors_.push_back(dma_desc);
 
-  if (status == DM0_STATUS_IDLE && done == DESC_DONE_COMPLETE) {
-return DMA_SUCCESS;
+  return DMA_SUCCESS;
+}
+
+void HexagonUserDMA::Wait(uint32_t max_dmas_in_flight) {
+  // wait (forever) until max DMAs in flight <= actual DMAs in flight
+  while (DMAsInFlight() > max_dmas_in_flight) {
+  }
+}
+
+uint32_t HexagonUserDMA::Poll() { return DMAsInFlight(); }
+
+uint32_t HexagonUserDMA::DMAsInFlight() {
+  // poll DMA engine to make sure DMA status is current
+  dmpoll();
+
+  // find the oldest DMA in flight
+  for (; oldest_dma_in_flight_ < dma_descriptors_.size(); 
++oldest_dma_in_flight_) {
+// read the `done` bit from the DMA descriptor and stop if incomplete
+unsigned int done = 
dma_desc_get_done(dma_descriptors_[oldest_dma_in_flight_]);
+if (done == DESC_DONE_INCOMPLETE) {
+  break;
+}
+  }
+  // total DMAs in flight = total DMAs - oldest DMA in flight
+  return dma_descriptors_.size() - oldest_dma_in_flight_;
+}
+
+HexagonUserDMA::~HexagonUserDMA() {
+  Init();  // reset DMA engine
+  for (auto dma_desc : dma_descriptors_) {
+free(dma_desc);
   }
-#endif
-  return DMA_FAILURE;
 }
 
 int hexagon_user_dma_1d_sync(void* dst, void* src, uint32_t length) {

Review Comment:
   My thought was to do this in the next phase of development when we work on 
TIR lowering to Async DMA.  What do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949037588


##
tests/cpp-runtime/hexagon/hexagon_user_dma_tests.cc:
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+
+#include "../src/runtime/hexagon/hexagon_user_dma.h"
+
+using namespace tvm::runtime;
+using namespace tvm::runtime::hexagon;
+
+class HexagonUserDMATest : public ::testing::Test {
+  void SetUp() override {
+src = malloc(length);
+dst = malloc(length);
+ASSERT_NE(src, nullptr);
+ASSERT_NE(dst, nullptr);
+
+src_char = static_cast(src);
+dst_char = static_cast(dst);
+for (uint32_t i = 0; i < length; ++i) {
+  src_char[i] = 1;
+  dst_char[i] = 0;
+}
+  }
+  void TearDown() override {
+free(src);
+free(dst);
+  }
+
+ public:
+  int ret{0};
+  void* src{nullptr};
+  void* dst{nullptr};
+  char* src_char{nullptr};
+  char* dst_char{nullptr};
+  uint32_t length{0x40};  // 4MB
+};
+
+TEST_F(HexagonUserDMATest, wait) {
+  HexagonUserDMA::Get().Wait(0);
+  HexagonUserDMA::Get().Wait(10);
+}
+
+TEST_F(HexagonUserDMATest, poll) { ASSERT_EQ(HexagonUserDMA::Get().Poll(), 0); 
}
+
+TEST_F(HexagonUserDMATest, bad_copy) {
+  uint64_t bigaddr = 0x1;
+  void* src64 = reinterpret_cast(bigaddr);
+  void* dst64 = reinterpret_cast(bigaddr);
+  uint32_t biglength = 0x100;
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst64, src, length), DMA_SUCCESS);
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst, src64, length), DMA_SUCCESS);
+  ASSERT_NE(HexagonUserDMA::Get().Copy(dst, src, biglength), DMA_SUCCESS);
+}
+
+TEST_F(HexagonUserDMATest, sync_dma) {
+  // kick off 1 DMA
+  ret = HexagonUserDMA::Get().Copy(dst, src, length);
+  ASSERT_EQ(ret, DMA_SUCCESS);
+
+  // wait for DMA to complete
+  HexagonUserDMA::Get().Wait(0);
+
+  // verify
+  for (uint32_t i = 0; i < length; ++i) {
+ASSERT_EQ(src_char[i], dst_char[i]);
+  }
+}
+
+TEST_F(HexagonUserDMATest, async_dma) {
+  // kick off 10x duplicate DMAs
+  for (uint32_t i = 0; i < 10; ++i) {
+ret = HexagonUserDMA::Get().Copy(dst, src, length);
+ASSERT_EQ(ret, DMA_SUCCESS);
+  }
+
+  // verify at least 1 DMA in flight
+  // TODO: re-enable when CI runs on hardware - fails on simulator
+  // ASSERT_GT(HexagonUserDMA::Get().Poll(), 0);

Review Comment:
   Yes, good thought.  Let me see if I can take care of this `TODO` today.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949036566


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -90,32 +78,80 @@ int hexagon_user_dma_1d_sync_helper(void* dst, void* src, 
uint32_t length) {
   dma_desc_set_src(dma_desc, src32);
   dma_desc_set_dst(dma_desc, dst32);
 
-  dmstart(dma_desc);
-  unsigned int status = dmwait() & DM0_STATUS_MASK;
-  unsigned int done = dma_desc_get_done(dma_desc);
+  // only for first DMA
+  if (first_dma_) {
+// reset DMA engine
+auto status = Init();
+if (status != DM0_STATUS_IDLE) {
+  return DMA_FAILURE;
+}
+
+// `dmstart` first descriptor
+dmstart(dma_desc);
+first_dma_ = false;
+  } else {
+// `dmlink` descriptor to tail
+dmlink(dma_descriptors_.back(), dma_desc);

Review Comment:
   From what I can tell, there is no limit.  Perhaps @kparzysz-quic or 
@arangasa have a comment here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949035189


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -28,55 +30,41 @@ namespace tvm {
 namespace runtime {
 namespace hexagon {
 
-int init_hexagon_user_dma() {
-#if __HEXAGON_ARCH__ >= 68
+unsigned int HexagonUserDMA::Init() {
   // reset DMA engine
   unsigned int status = dmpause() & DM0_STATUS_MASK;
-  if (status != DM0_STATUS_IDLE) {
-return DMA_FAILURE;
-  }
-#endif
-  return DMA_SUCCESS;
+  return status;
 }
 
-int hexagon_user_dma_1d_sync_helper(void* dst, void* src, uint32_t length) {
-#if __HEXAGON_ARCH__ >= 68
-  static int config_dma = init_hexagon_user_dma();
-  if (config_dma != DMA_SUCCESS) {
+int HexagonUserDMA::Copy(void* dst, void* src, uint32_t length) {
+  // length limited to 24 bits
+  if (length > DESC_LENGTH_MASK) {
 return DMA_FAILURE;
   }
 
-  uint64_t src64 = reinterpret_cast(src);
   // source address limited to 32 bits
-  if (src64 > DESC_SRC_MASK) {
+  uint64_t src64 = reinterpret_cast(src);
+  if (!src64 || src64 > DESC_SRC_MASK) {
 return DMA_FAILURE;
   }
 
-  uint64_t dst64 = reinterpret_cast(dst);
   // destination address limited to 32 bits
-  if (dst64 > DESC_DST_MASK) {
-return DMA_FAILURE;
-  }
-
-  // length limited to 24 bits
-  if (length > DESC_LENGTH_MASK) {
+  uint64_t dst64 = reinterpret_cast(dst);
+  if (!dst64 || dst64 > DESC_DST_MASK) {
 return DMA_FAILURE;
   }
 
-  uint32_t src32 = src64 & DESC_SRC_MASK;
-  uint32_t dst32 = dst64 & DESC_DST_MASK;
+  uint32_t src32 = static_cast(src64);
+  uint32_t dst32 = static_cast(dst64);
 
+  // allocate new descriptor
   void* dma_desc = nullptr;
-
   int ret = posix_memalign(_desc, DMA_DESC_2D_SIZE, DMA_DESC_2D_SIZE);
-  if (ret) {

Review Comment:
   Added to the backlog



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949031702


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -90,32 +78,80 @@ int hexagon_user_dma_1d_sync_helper(void* dst, void* src, 
uint32_t length) {
   dma_desc_set_src(dma_desc, src32);
   dma_desc_set_dst(dma_desc, dst32);
 
-  dmstart(dma_desc);
-  unsigned int status = dmwait() & DM0_STATUS_MASK;
-  unsigned int done = dma_desc_get_done(dma_desc);
+  // only for first DMA
+  if (first_dma_) {
+// reset DMA engine
+auto status = Init();
+if (status != DM0_STATUS_IDLE) {
+  return DMA_FAILURE;
+}
+
+// `dmstart` first descriptor
+dmstart(dma_desc);
+first_dma_ = false;
+  } else {
+// `dmlink` descriptor to tail
+dmlink(dma_descriptors_.back(), dma_desc);
+  }
 
-  free(dma_desc);
+  // set descriptor as new tail
+  dma_descriptors_.push_back(dma_desc);
 
-  if (status == DM0_STATUS_IDLE && done == DESC_DONE_COMPLETE) {
-return DMA_SUCCESS;
+  return DMA_SUCCESS;
+}
+
+void HexagonUserDMA::Wait(uint32_t max_dmas_in_flight) {
+  // wait (forever) until max DMAs in flight <= actual DMAs in flight
+  while (DMAsInFlight() > max_dmas_in_flight) {

Review Comment:
   I agree completely.  Had the same thought about a buried log message.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [tvm] adstraw commented on a diff in pull request #12411: [Hexagon] Asynchronous DMA support

2022-08-18 Thread GitBox


adstraw commented on code in PR #12411:
URL: https://github.com/apache/tvm/pull/12411#discussion_r949031227


##
src/runtime/hexagon/hexagon_user_dma.cc:
##
@@ -90,32 +78,80 @@ int hexagon_user_dma_1d_sync_helper(void* dst, void* src, 
uint32_t length) {
   dma_desc_set_src(dma_desc, src32);
   dma_desc_set_dst(dma_desc, dst32);
 
-  dmstart(dma_desc);
-  unsigned int status = dmwait() & DM0_STATUS_MASK;
-  unsigned int done = dma_desc_get_done(dma_desc);
+  // only for first DMA
+  if (first_dma_) {
+// reset DMA engine
+auto status = Init();
+if (status != DM0_STATUS_IDLE) {
+  return DMA_FAILURE;
+}
+
+// `dmstart` first descriptor
+dmstart(dma_desc);
+first_dma_ = false;
+  } else {
+// `dmlink` descriptor to tail
+dmlink(dma_descriptors_.back(), dma_desc);
+  }
 
-  free(dma_desc);
+  // set descriptor as new tail
+  dma_descriptors_.push_back(dma_desc);
 
-  if (status == DM0_STATUS_IDLE && done == DESC_DONE_COMPLETE) {
-return DMA_SUCCESS;
+  return DMA_SUCCESS;
+}
+
+void HexagonUserDMA::Wait(uint32_t max_dmas_in_flight) {
+  // wait (forever) until max DMAs in flight <= actual DMAs in flight
+  while (DMAsInFlight() > max_dmas_in_flight) {
+  }
+}
+
+uint32_t HexagonUserDMA::Poll() { return DMAsInFlight(); }
+
+uint32_t HexagonUserDMA::DMAsInFlight() {
+  // poll DMA engine to make sure DMA status is current
+  dmpoll();
+
+  // find the oldest DMA in flight
+  for (; oldest_dma_in_flight_ < dma_descriptors_.size(); 
++oldest_dma_in_flight_) {
+// read the `done` bit from the DMA descriptor and stop if incomplete
+unsigned int done = 
dma_desc_get_done(dma_descriptors_[oldest_dma_in_flight_]);
+if (done == DESC_DONE_INCOMPLETE) {
+  break;
+}
+  }
+  // total DMAs in flight = total DMAs - oldest DMA in flight
+  return dma_descriptors_.size() - oldest_dma_in_flight_;
+}
+
+HexagonUserDMA::~HexagonUserDMA() {
+  Init();  // reset DMA engine
+  for (auto dma_desc : dma_descriptors_) {
+free(dma_desc);
   }
-#endif
-  return DMA_FAILURE;
 }
 
 int hexagon_user_dma_1d_sync(void* dst, void* src, uint32_t length) {

Review Comment:
   My thought was to do this in the next phase of development when we work on 
TIR lowering to Async DMA.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org