Data Science

AI/ML

  • OpenChatKit : OpenChatKit provides a powerful, open-source base to create both specialized and general-purpose chatbots for various applications
  • SkyPilot : SkyPilot is a framework for easily and cost-effectively running ML workloads on any cloud
  • UniLM : Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Books

Cyber Analytics

Data

Tools

Platforms/Suites

  • Brim : Novel data munging and analysis platform, built on [zed](https://github.com/brimdata/zed), a super-structured data model. *
  • RapidMiner : Best-in-class data analytics tool suite
  • Stroom : Stroom is a highly scalable data storage, processing and analysis platform.

Notebooks in the Cloud

Visualization Platforms

  • grafana : Operational dashboards for your data here, there, or anywhere
  • metabase : The simplest, fastest way to get business intelligence and analytics to everyone in your company
  • redash : Connect to any data source, easily visualize, dashboard and share your data.
  • superset : Apache Superset is a modern data exploration and visualization platform

Data Science Workflow Tools

  • Katib : Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports Hyperparameter Tuning, Early Stopping and Neural Architecture Search
  • Kedro : Kedro is an open-source Python framework to create reproducible, maintainable, and modular data science code
  • MetaFlow : A human-friendly Python library that makes it straightforward to develop, deploy, and operate various kinds of data-intensive applications, in particular those involving data science and ML
  • Ploomber : Framework to build collaborative and modular pipelines; it integrates with Jupyter but you can use it with any other editor
  • Seldon : An open source platform to deploy your machine learning models on Kubernetes at massive scale.