Hi Tao,

It's written with multi-workspace support in mind, but the code may not work 100%, so I'll need to take a look :)

Roughly: you get a data export as an admin, then install the Slack app in your workspace, and then I make some config changes for the install.

Ping me if I don't get back to you in a week :)

-ash

On Wed, 14 Apr, 2021 at 10:48, Tao Feng <fengta...@gmail.com> wrote:
Hey Ash,

That's great! I am interested in using it for my other open source project (<https://github.com/amundsen-io/amundsen>), which is currently hitting the 10k message limit as well. I wonder what it takes to set up/enable it for another Slack workspace?

Thanks,
-Tao

On Wed, Apr 14, 2021 at 5:40 AM Ash Berlin-Taylor <a...@apache.org> wrote:
Hello everyone,

Thanks to prompting from Sumit, I have "resurrected" a project I started back in 2019, and have got searchable slack archives available:

<https://apache-airflow.slack-archives.org/>

(This is a fancy looking URL, there is nothing else on the domain yet. Any other projects want this too?)

A little-known fact about Slack is that the export an admin can do contains /all/ messages, not just the ones the client will show, so this has all 130k+ messages in the DB. For example, <https://apache-airflow.slack-archives.org/announcements/page-1> shows the very first message from Fokko.
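
To see how many messages an export really holds, something like the following could be run against an unpacked export. This is only a rough sketch: it assumes the usual export layout of one directory per channel with one JSON file per day, each holding a list of messages. The name `count_messages` is made up for illustration and is not part of the slackarchive code.

```python
import json
from pathlib import Path


def count_messages(export_dir):
    """Return {channel_name: message_count} for an unpacked Slack export.

    Assumes one sub-directory per channel, each containing per-day
    JSON files whose top-level value is a list of message objects.
    """
    counts = {}
    for channel_dir in sorted(Path(export_dir).iterdir()):
        if not channel_dir.is_dir():
            continue  # skip top-level files such as channels.json
        total = 0
        for day_file in channel_dir.glob("*.json"):
            with day_file.open(encoding="utf-8") as fh:
                total += len(json.load(fh))
        counts[channel_dir.name] = total
    return counts
```

Summing the values of the returned dict gives the total for the whole workspace.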

This service also has a bot user, called Archie the Archive Bot, that if invited to channels will listen for messages (and deletions/edits etc). To get this bot in the channel we need to run /invite @archie -- I'm not sure if only workspace admins can do this or if anyone can.

Features that are missing/broken/confusing right now:
- Thread replies aren't handled visually correctly -- rather than being nested under the original message they are just shown like a normal message.
- No ability to permalink to a specific message
- Markdown formatting in messages might be incomplete
- Shared files/images may or may not be accessible. I haven't really tested it.
- It needs a privacy policy/data protection statement
- Since it's using VueJS (the choice of the project I forked this from, not mine) it's probably not indexable by search engines.
- The front end is only showing 10k messages per channel (<https://apache-airflow.slack-archives.org/random/page-1>) -- given it's paged already there's no need for this limitation to exist.
- Links to users don't go anywhere sensible.
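
On the thread-reply item, one possible way to nest replies is sketched below. It relies on the `ts` and `thread_ts` fields that Slack puts on messages (a reply carries the parent's `ts` in its `thread_ts`). The function name `nest_threads` is hypothetical, and this is not how the slackarchive code currently works.

```python
from decimal import Decimal


def nest_threads(messages):
    """Group thread replies under their parent message.

    Returns a list of (parent, replies) tuples ordered by timestamp.
    A reply whose parent is not in `messages` is kept as a top-level
    entry rather than dropped.
    """
    threads = {}  # parent ts -> (parent message, list of replies)
    order = []    # parent ts values in display order
    for msg in sorted(messages, key=lambda m: Decimal(m["ts"])):
        thread_ts = msg.get("thread_ts")
        if thread_ts is not None and thread_ts != msg["ts"] and thread_ts in threads:
            threads[thread_ts][1].append(msg)
        else:
            threads[msg["ts"]] = (msg, [])
            order.append(msg["ts"])
    return [threads[ts] for ts in order]
```

The front end could then render each parent followed by its indented replies instead of a flat list.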

I'm sure there are many more small gotchas, but I didn't want perfect to be the enemy of the good.

The code for this service lives at <https://github.com/ashb/slackarchive> - PRs welcome ;) Most of the readme is still wrong there.

If anyone isn't happy with this I can delete it, or set certain channels to not be archived etc.

-ash
