Mining Sets of Patterns:

Next Generation Pattern Mining

a tutorial at ICDM 2011, Vancouver, Canada

Tutorial by

Björn Bringmann, Siegfried Nijssen, Nikolaj Tatti, Jilles Vreeken, and Albrecht Zimmermann

Wednesday, 14th of December 2011

Abstract

Pattern mining is one of the most important topics in data mining. The core idea is to extract relevant "nuggets" of knowledge describing parts of a database. However, many traditional (frequent) pattern mining algorithms find patterns in numbers too large to be of practical value: so many "nuggets" of knowledge are found that they do not combine into a better global understanding of the data. In fact, often the number of discovered patterns is larger than the size of the original database!

To tackle this problem, many techniques have been developed in recent years for finding not all patterns, but useful sets of patterns. The aim of this tutorial is to provide a general, comprehensive overview of the state of the art in mining such high-quality sets of patterns.

We will provide an overview of these methods and results, generalizing over different data types (from itemsets to graphs) and over multiple tasks (from unsupervised to supervised). The main contributions of the tutorial include an exploration of the relationships between classic machine learning and recent pattern set mining algorithms, as well as an overview of the connections between traditional pattern mining and modern pattern set discovery algorithms.

As a key contribution we will give a general framework: rather than listing independent algorithms, we will identify the key concepts of pattern set mining and show how many methods exploit these concepts. For the convenience of the attendees, as well as those interested in the topic in general, we have established a website that succinctly explains the main approaches and gives, per approach, references to the main proponents and, if available, links to implementations.

Schedule

The tutorial will take place on Wednesday, the 14th of December 2011, and will last 3 hours in total. It will be held in the 'Ports of SF/NY' room of the Renaissance hotel. Please find the tentative schedule below.

10.10-11.00   Part 1: Introduction and Agnosticity   by Siegfried Nijssen (K.U. Leuven)
  • What is Pattern Mining?
  • What is Pattern Set Mining?
  • Plug-in criterion Pattern Set Mining Methods
11.00-12.00   Part 2: Unsupervised Pattern Set Mining   by Jilles Vreeken (University of Antwerp)
  • Deviation-based Pattern Set Mining
  • Description-based Pattern Set Mining
12.00-13.30   Lunch
13.30-14.30   Part 3: Supervised Pattern Set Mining   by Albrecht Zimmermann (K.U. Leuven)
  • Mining Target-Correlated Sets of Patterns

Meet the Tutors

The authors of the tutorial are, in alphabetical order, Björn Bringmann, Siegfried Nijssen, Nikolaj Tatti, Jilles Vreeken, and Albrecht Zimmermann. All have extensive experience in researching and presenting on the mining of sets of patterns, and in putting these patterns to good use. For clarity of presentation, we plan to present the tutorial with 3 speakers, one for each part: Siegfried Nijssen, Jilles Vreeken, and Albrecht Zimmermann, respectively.

Slides and handouts

We provide a companion site that gives a more complete and better-structured overview of the references given in the tutorial. We will not provide printed reference lists at the tutorial.

part one     download
part two     download
part three   download

Registration/Vancouver/Venue

As this tutorial is organised in the context of ICDM 2011, attending the tutorial requires registration for the conference. Similarly, we refer to the main sites for information on Vancouver and the conference venue.