Intelligent Enterprise: Better Insight for Business Decisions

Get Started With Data Mining Now


Data mining is widespread because it works. It can improve an organization's ability to reach its goals. Its popularity is also rising because the tools are better, more broadly available, cheaper and easier to use.


By Warren Thornthwaite
October 1, 2005

Data mining has come into its own over the past decade, taking a central role in many businesses. We're all the subject of data mining dozens of times a day—from the direct mail we receive to the fraud-detection algorithms that scrutinize our every credit card purchase.

Data mining is widespread because it works. The techniques can significantly improve an organization's ability to reach its goals. Data mining's popularity is also rising because the tools are better, more broadly available, cheaper and easier to use.

Many data warehouse/business intelligence (DW/BI) teams aren't sure how to get started with data mining. This column presents a business-based approach that will help you successfully add data mining to your DW/BI system.

The data mining process must begin with an understanding of business opportunities. The diagram on the right shows the three phases of the data mining process, major task areas within those phases and common iteration points.

The Business Phase

This first phase is a more focused version of the overall DW/BI requirements gathering process. Identify and prioritize a list of opportunities that can have a significant business impact. The business opportunities and data understanding tasks in the diagram are linked because identifying opportunities must be grounded in the realities of the data world. By the same token, the data itself may suggest business opportunities.

As always, the most important step in successful BI isn't about technology; it's about understanding the business. Meet with businesspeople about potential opportunities and the associated relationships and behaviors captured in the data. The goal of these meetings is to identify several high-value opportunities and carefully examine each one. First, describe business objectives in measurable ways. "Increase sales" is too broad; "reduce the monthly churn rate" is more manageable. Next, think about what factors influence the objective. What might indicate that someone is likely to churn? How can you tell if someone would be interested in a given product? While you're discussing these factors, try to translate them into specific attributes and behaviors that are known to exist in a usable, accessible form.
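To make "measurable" concrete, here is a minimal Python sketch of one way to compute a monthly churn rate. The definition used (customers active at the start of the month who are no longer active at the end, divided by customers active at the start) is an assumption for illustration; your business may define churn differently. The function name and the sample customer IDs are hypothetical.

```python
def monthly_churn_rate(active_at_start, active_at_end):
    """Share of customers active at month start who are gone by month end.

    Assumed definition for illustration only; agree on the real one
    with the business before building anything on it.
    """
    start = set(active_at_start)
    if not start:
        raise ValueError("no active customers at the start of the month")
    lost = start - set(active_at_end)
    return len(lost) / len(start)


# Hypothetical customer IDs: 3 and 5 left during the month, 6 is new.
start_of_month = [1, 2, 3, 4, 5]
end_of_month = [1, 2, 4, 6]
rate = monthly_churn_rate(start_of_month, end_of_month)
print(f"Monthly churn rate: {rate:.0%}")
```

New customers acquired during the month (customer 6 here) do not affect the rate under this definition, which is one of the details worth settling in the business meetings.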

After several meetings with different groups to identify and prioritize a range of opportunities, take the top-priority business opportunity and its associated list of potential variables back to the DW for further exploration. Spend a lot of time exploring the data sets that might be relevant to the business opportunities discussed. At this stage, the goal is to verify that the data needed to support the business opportunity is available and clean enough to be usable.

You can discover many of the content, relationship and quality problems firsthand through data profiling—using query and reporting tools to get a sense of the content under investigation. While data profiling can be as simple as writing some SQL SELECT statements with COUNTs and DISTINCTs, several data profiling tools can provide complex analysis that goes well beyond simple queries.
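As a rough sketch of the simple end of data profiling, the following Python example runs COUNT and COUNT(DISTINCT) queries against an in-memory SQLite table. The customer table, its columns and its rows are invented for illustration, and profile_column is a hypothetical helper, not part of any profiling tool.

```python
import sqlite3

# Hypothetical customer table with a few deliberate quality problems.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER, customer_type TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "Retail", "CA"),
        (2, "Retail", "ca"),       # inconsistent casing
        (3, "Wholesale", None),    # missing state
        (4, None, "NY"),           # missing type
        (4, "Retail", "NY"),       # duplicate key
    ],
)


def profile_column(conn, table, column):
    """Return row count, null count and distinct count for one column."""
    # Identifiers cannot be bound as parameters, so pass trusted names only.
    sql = f"SELECT COUNT(*), COUNT({column}), COUNT(DISTINCT {column}) FROM {table}"
    total, non_null, distinct = conn.execute(sql).fetchone()
    return {"rows": total, "nulls": total - non_null, "distinct": distinct}


profile = {
    col: profile_column(conn, "customer", col)
    for col in ("customer_id", "customer_type", "state")
}
for col, stats in profile.items():
    print(col, stats)
```

Even this small profile surfaces the kinds of issues worth documenting: fewer distinct customer_id values than rows suggests duplicate keys, and a distinct count for state that is higher than expected points to inconsistent coding.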

Once you have a clear, viable opportunity identified, document the following:

  • Business opportunity description
  • Expected data issues
  • Modeling process description
  • Implementation plan
  • Maintenance plan

Finally, review the opportunity and documentation with businesspeople to make sure you understand their needs and they understand how you intend to meet them.

The Data Mining Phase

Now you get to build some data mining models. The three major tasks in this phase involve preparing the data, developing alternative models and comparing their accuracy, and validating the final model. As shown in the diagram at right, this is a highly iterative process.
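The compare-and-validate loop can be sketched in a few lines of Python. This is only an illustration: the two candidate "models" are hand-written scoring rules standing in for models a data mining tool would train, and the holdout cases, attribute names and thresholds are all hypothetical.

```python
def accuracy(model, cases):
    """Fraction of cases where the model's prediction matches the known outcome."""
    correct = sum(1 for features, label in cases if model(features) == label)
    return correct / len(cases)


# Hypothetical holdout cases, kept out of model building: (attributes, churned?)
holdout = [
    ({"months_inactive": 4, "support_calls": 0}, True),
    ({"months_inactive": 0, "support_calls": 5}, True),
    ({"months_inactive": 1, "support_calls": 1}, False),
    ({"months_inactive": 0, "support_calls": 0}, False),
    ({"months_inactive": 3, "support_calls": 2}, True),
]

# Stand-ins for alternative models produced during the modeling task.
candidates = {
    "inactivity_only": lambda f: f["months_inactive"] >= 3,
    "inactivity_or_calls": lambda f: f["months_inactive"] >= 3
    or f["support_calls"] >= 4,
}

scores = {name: accuracy(model, holdout) for name, model in candidates.items()}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.0%}")
print("Best candidate on holdout:", best)
```

Scoring every candidate against the same held-out cases is what makes the comparison fair. In practice the losing candidates often send you back to data preparation, which is where the iteration comes from.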

The first task in this phase is to build the data mining case sets. A case set includes one row per instance or event. For many data mining models, this means a data set with one row per customer. Models based on simple customer attributes, such as gender and marital status, work at the one-row-per-customer level. Models that include repeated behaviors, such as purchases, use data at the one-row-per-event level.
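The difference between the two grains is easier to see with data. The Python sketch below builds both kinds of case set from the same source rows. The customer and purchase records, the column names and the two helper functions are hypothetical, and rolling purchases up to a count and a total is just one possible way to summarize behavior at the customer level.

```python
from collections import defaultdict

# Hypothetical source rows, as they might come out of the DW.
customers = [
    {"customer_id": 1, "gender": "F", "marital_status": "Married"},
    {"customer_id": 2, "gender": "M", "marital_status": "Single"},
]
purchases = [
    {"customer_id": 1, "product": "Tent", "amount": 120.0},
    {"customer_id": 1, "product": "Stove", "amount": 45.0},
    {"customer_id": 2, "product": "Tent", "amount": 110.0},
]


def customer_case_set(customers, purchases):
    """One row per customer: attributes plus purchases rolled up to summaries."""
    totals = defaultdict(lambda: {"purchase_count": 0, "total_amount": 0.0})
    for p in purchases:
        t = totals[p["customer_id"]]
        t["purchase_count"] += 1
        t["total_amount"] += p["amount"]
    return [{**c, **totals[c["customer_id"]]} for c in customers]


def event_case_set(customers, purchases):
    """One row per purchase event, with the customer's attributes repeated."""
    by_id = {c["customer_id"]: c for c in customers}
    return [{**by_id[p["customer_id"]], **p} for p in purchases]


customer_cases = customer_case_set(customers, purchases)
event_cases = event_case_set(customers, purchases)
print(len(customer_cases), "customer-level rows")
print(len(event_cases), "event-level rows")
```

The customer-level set has two rows and the event-level set has three, because customer 1 appears once per purchase in the latter.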

A well-designed and built dimensional DW is a perfect source for data mining case data. Ideally, many variables identified in the business opportunity already exist as cleansed DW attributes—often true with fields such as customer_type or product_color. The data miner's world gets even better when demographics and other external data are already loaded into the DW using conformed dimensions.




 




